Abstract

A proper framework for measuring and mitigating risk in dynamic settings is of utmost importance, 
on both a practical, as well as a theoretical level. In recent years, coherent risk measures have emerged 
as a viable alternative to classical frameworks involving expected utility theory; their properties and 
axiomatic representation theorems are well understood in static settings, and have recently been extended
to dynamic decision problems. In practice, however, since the specification of the resulting
dynamic risk measures is typically much more involved, single-period static risk metrics are often used 
to assess dynamic risk, due to their conciseness and ease of interpretability. This can lead to an over-
or underestimation of the true dynamic risk, as well as potentially inconsistent behavior.

In this paper, we investigate two different frameworks for assessing the risk in a multi-period decision 
process: a dynamically inconsistent formulation (whereby a single, static risk measure is applied to 
the entire sequence of future costs), and a dynamically consistent one, obtained by suitably composing 
one-step risk mappings. We characterize the class of dynamically consistent measures that provide a 
tight approximation for a given inconsistent measure, and discuss how the approximation factors can be 
computed. For the case where the consistent measures are given by Average Value-at-Risk, we derive
a polynomial-time algorithm for approximating an arbitrary inconsistent distortion measure. We also 
present exact analytical bounds for the case where the dynamically inconsistent measure is also given 
by Average Value-at-Risk, and briefly discuss managerial implications in multi-period risk-assessment 
processes. Our theoretical and algorithmic constructions exploit interesting connections between the 
study of risk measures and the theory of submodularity and lattice programming, which may be of 
independent interest. 

1 Introduction 

Recent years have witnessed numerous examples of poor risk management practices, surfacing in a variety 
of areas of human activity. From the financial meltdown of 2007 and 2008, to the large-scale recalls 
operated by Toyota or the Deepwater Horizon oil rig blowout, the question of adequate measurement, 
presentation and mitigation of risk has been a hot topic of discussion and a source for academic research 
and mainstream media coverage, alike. 

While topical, the subject of risk is certainly not new. Classical paradigms studied in decision theory
ascribe risk by means of a concave utility function, so that a risk-averse decision maker's goal becomes
to maximize his expected utility under a particular probability distribution (see, e.g., Von Neumann and
Morgenstern [56], Savage [47]). This approach has proven extremely influential in fields as diverse as
microeconomic theory, decision analysis, operations research, management science, and has witnessed 
many extensions and developments. From a normative perspective, it has been argued that the main 
pitfall of the expected utility paradigm is that it requires the complete specification of a utility function 
and knowledge of the (unique) probability distribution governing outcomes. From a descriptive point of
view, the theory fails to explain several well-known "paradoxes" of choice under uncertainty (Allais [5],
Ellsberg [19], Kahneman and Tversky [30]). This has led to several relevant modifications and extensions
of expected utility theory (see, e.g., Simon [54], Kahneman and Tversky [30], Quiggin [37], Machina 
[33], Yaari [60], Gilboa and Schmeidler [27], Schmeidler [49] and the recent book Wakker [57] for more 
references), which typically involve a transformation of the underlying payoffs, probability distribution 
or utility function. 

The topic of risk has also been dealt with extensively in the field of mathematical finance. A standard 
benchmark, popularized in the late 1990s and included in the Recommendations of the Basel Committee 
on Banking Supervision of 2006 [10], is the quantile-based Value at Risk (or VaR) method. For a particular
measurement period (horizon) and a (typically small) probability level $\epsilon$, it is defined as the loss in market
value over the respective horizon that is exceeded with probability $\epsilon$ [17]. Despite its popularity, the VaR
metric suffers from several known pitfalls, such as its ignorance with respect to the magnitude of the
losses (in the $\epsilon$-tail of the loss distribution), and the fact that it does not reward risk diversification.

In recognition of these issues, and acknowledging the need for a more systematic approach to risk 
measurement, Artzner et al. [6] introduced a set of axioms that any desirable risk metric should satisfy. 
These properties are monotonicity, translation invariance, subadditivity and positive homogeneity, and 
any risk measure satisfying all four is deemed to be coherent. As an example, it is known that VaR is not 
coherent (since it lacks subadditivity), but the related measure of average value at risk (AVaR), which 
examines tail expectations, is. Artzner et al. [6] proved that any coherent risk measure can be represented 
as an expected value taken with respect to a worst-case probability measure, where the latter is chosen 
by an adversary out of a set of allowable measures. The same paper also argued how any risk measure 
can be seen as arising from a set of acceptable outcomes, whereby a particular cost is deemed acceptable 
if and only if the risk associated with it is non-positive, and the riskiness in any cost is given by the 
(constant) reduction that would make it acceptable. 

Since the seminal paper of Artzner et al., several variations and extensions of coherent risk measures 
have been introduced and studied in the literature (see, e.g., Wirch and Hardy [59], Rockafellar and
Uryasev [41, 39], Wang [58], Follmer and Schied [22], Kusuoka [31], Acerbi [2], Follmer and Schied
[23], Frittelli and Gianin [24], Eichhorn and Romisch [18], Ruszczynski and Shapiro [46], as well as the
books Follmer and Schied [23], Pflug and Romisch [36] and Shapiro et al. [53] for more references).
Most notable among them are the convex risk measures [22, 24, 32] (which correspond to relaxing the 
axioms of subadditivity and positive homogeneity into the weaker requirement of convexity), as well as 
the distortion or spectral risk measures (see, e.g., [59, 58, 2, 31, 55] and the references therein). Several 
interesting connections have also been drawn between risk measures and utility functions, by means of 
optimized certainty equivalents (Ben-Tal and Teboulle [11]). 

Most of the above papers treat the case of a static (i.e., single-period) risk measurement. In most 
real-world settings of interest, the underlying problems are dynamic, requiring an assessment of risk at
multiple points in time. For instance, financial institutions are routinely required to assess the riskiness
of their portfolios at the end of each trading day. Similarly, a manufacturer would typically desire a
dynamic assessment of the risks in its future revenue streams, so as to reflect potential changes in its
procurement or credit cost structure, or in its realized sales. 

A naive approach to dynamic risk assessment at an intermediate time t would be to apply a static 
risk measure to the total future costs accumulated over the remaining problem horizon¹, and conditioned
on the available information at time t. However, this could immediately lead to inconsistencies in the
decision process, whereby risk preferences would change in a seemingly irrational fashion between
consecutive assessment periods. A simple example adapted from Artzner et al. [7] that exhibits the dynamic
inconsistency of a naive AVaR is illustrated in Figure 1, and described below.



¹ This could be the case when the latter costs correspond to a locked-in financial position [7].
Example 1.1. Consider the tree in Figure 1. For the random cost $Y$ satisfying $Y(UU) = 10$, $Y(UM) = -12$,
$Y(UD) = -14$, $Y(DU) = 20$, $Y(DM) = -22$, $Y(DD) = -22$, under a uniform probability distribution,
it can be checked that $\mathrm{AVaR}_{2/3}(Y \mid U) = \mathrm{AVaR}_{2/3}(Y \mid D) = -1$, while $\mathrm{AVaR}_{2/3}(Y) = 1$. In particular,
under the risk measure $\mathrm{AVaR}_{2/3}$, position $Y$ is acceptable in every state of the world in stage $t = 2$,
but it is unacceptable at $t = 1$.

Figure 1: Example showing dynamic inconsistency of the risk measure $\mathrm{AVaR}_{2/3}$. The loss is acceptable
in nodes U and D, but unacceptable in node R.
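
The numbers in Example 1.1 can be re-derived with a few lines of code. The sketch below is only an illustration: the helper name `avar` and the list encoding of the tree are assumptions made here, and AVaR is computed for a finite distribution as the average cost over the worst $\epsilon$-fraction of probability mass (larger cost is worse).

```python
from fractions import Fraction

def avar(costs, probs, eps):
    """Average cost over the worst eps-fraction of probability mass (hypothetical helper)."""
    remaining = eps
    total = Fraction(0)
    # Visit outcomes from largest to smallest cost, consuming eps of probability mass.
    for cost, prob in sorted(zip(costs, probs), reverse=True):
        take = min(prob, remaining)
        total += take * cost
        remaining -= take
    return total / eps

eps = Fraction(2, 3)
third, sixth = Fraction(1, 3), Fraction(1, 6)
Y_U = [10, -12, -14]   # costs in leaves UU, UM, UD
Y_D = [20, -22, -22]   # costs in leaves DU, DM, DD

avar_U = avar(Y_U, [third] * 3, eps)        # conditional assessment at node U
avar_D = avar(Y_D, [third] * 3, eps)        # conditional assessment at node D
avar_R = avar(Y_U + Y_D, [sixth] * 6, eps)  # naive static assessment at the root R
print(avar_U, avar_D, avar_R)  # -1 -1 1
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the comparison with the values stated in the example is not blurred by floating-point rounding.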

We remark that there is nothing special about the particular choices in our example, i.e., similar 
constructions could be devised for other static risk measures, and for different sample spaces or probability 
measures. The main reason why the inconsistency arises is the very specification of the dynamic risk 
measurement process at the different time points. 

From a purely pragmatic point of view, however, the method suggested above seems eminently sensible.
The exact way of measuring the risk of future costs is easy to explain at any node in the sample
space, and the risk preference can be specified in a very compact fashion (e.g., at the root node R, by 
giving a single static risk measure), which has the advantage of being easier to calibrate from observed 
preference data, potentially leading to a wider adoption in practice. 

In order to correct for such undesirable effects, one has to impose additional conditions on the risk
measurement process at distinct time periods. Such requirements have been discussed extensively in
the literature on decision theoretic models and dynamic risk theory (see, e.g., Epstein and Schneider
[20], Riedel [38], Cheridito et al. [13], Artzner et al. [7], Detlefsen and Scandolo [16], Roorda et al. [43],
Ruszczynski and Shapiro [45], Follmer and Penner [21], Roorda and Schumacher [42], Ruszczynski [44]
and the references therein). While there are several notions of dynamic consistency (or time consistency, 
as it is sometimes called), most authors agree on the basic interpretation of the requirement, namely that 
if a certain outcome is considered less risky in all states of the world at time t + 1, then it should also be 
considered less risky at time t. From a representation standpoint, it can be shown (see [20, 38, 43, 21, 44]) 
that any risk measure that is dynamically consistent is obtained by composing one-step conditional risk 
mappings. In the context of the simple Example 1.1, this would imply that a risk measurement at node 
R should consist of a risk measure applied to (conditional) risk assessments at nodes U and D. 

From a pragmatic perspective, compared with the naive (and dynamically inconsistent) risk measurement
suggested before, the compositional representation entails a considerably more complicated form of
risk assessment, whereby, to calculate the risk of a future cost sequence, one needs to specify single-period 
conditional risk mappings for every future time-point (and every node of the sample space). Furthermore, 
our conversations with managers revealed a certain feeling that such a measurement process could result 
in "overly conservative" assessments, since risks are compounded in time (for instance, in Example 1.1, 
a dynamically-consistent AVaR requires computing conditional tail expectations of quantities that are 
already conditional tail expectations of end costs). A theoretical underpinning for this effect has been 
suggested in the recent paper by Shapiro [52], which formalizes a precise sense in which a compositional
risk measure can be more conservative than a naive (static) one. The paper does not, however, provide 
reverse conditions, and does not fully relate the two risk assessments. 

With this motivation in mind, the goal of the present paper is to better understand the tradeoffs 
between the consistent and naive (inconsistent) risk measurements. We view the paper as a step towards 
a systematic way of constructing risk-adjusted objective functions that are compatible with the modern 
theory of risk measures, but are also computationally tractable and "easy" to calibrate and explain to 
managers. Our contributions are as follows. 

• We formalize the problem of comparing a dynamically consistent distortion risk measure $\mu_C$ and
an inconsistent distortion risk measure $\mu_I$ from the perspective of the first period in a multi-stage
decision process.

• We describe necessary and sufficient conditions under which $\mu_C$ underestimates the risk as
compared with $\mu_I$. For any such pair of measures, we characterize the class of scaling factors $\alpha$
such that $\alpha \cdot \mu_C$ overestimates risk as compared with $\mu_I$. We call the smallest such factor the price
of dynamic consistency $\alpha^\star$, reflecting the amount by which an inconsistent assessment of capital
requirement (under $\mu_I / \alpha^\star$) would have to be scaled in order to ensure that it meets the dynamically
consistent requirements (under $\mu_C$).

• We show that, for general distortion risk measures $\mu_C$ and $\mu_I$, it is NP-hard to assess the price of
dynamic consistency. However, when $\mu_C$ is given by a dynamically consistent version of the AVaR
risk measure, and the probability measure is uniform, then the price of consistency can be computed
in polynomial time for any inconsistent distortion measure $\mu_I$.

• For the two-period case where both $\mu_C$ and $\mu_I$ are given by AVaR risk measures, we derive an exact
analytical expression for the price of dynamic consistency, and we also characterize the risk measure
$\mu_C$ that would result in the smallest price of consistency. Our results provide an interesting managerial
insight, suggesting that, in order to minimize the price of dynamic inconsistency, one should
always be more conservative in the later stages of the dynamically consistent risk measurement
process. However, the extent of this effect depends on the initial degree of risk aversion in the naive
measurement process - if one is highly risk averse, then it may actually be optimal to measure the
risk in the same way in each stage of the dynamically consistent process.

The rest of the paper is organized as follows. Section 2 provides the necessary background in static and 
dynamic risk measures, and introduces the precise mathematical formulation for the questions addressed 
in the present paper. Section 3 discusses the case of arbitrary consistent and inconsistent distortion risk 
measures, and characterizes the resulting price of dynamic consistency. Section 4 discusses the analytical 
results for the case of AVaR, and Section 5 contains an analysis of the computational complexity for 
finding the price of dynamic consistency. Section 6 concludes the paper. 

1.1 Notation 

With $i < j$, we use $[i,j]$ to denote the index set $\{i, \dots, j\}$. For a vector $x \in \mathbb{R}^n$ and $j \in \{1, \dots, n\}$, we
use $x_j$ to denote the $j$-th component of $x$. For a set $S \subseteq \{1, \dots, n\}$, we let $x(S) := \sum_{i \in S} x_i$. Also, we use
$x_S \in \mathbb{R}^n$ to denote the vector with components $x_i$ for $i \in S$ and $0$ otherwise (e.g., $\mathbf{1}_S$ is the characteristic
vector of the set $S$), and $x|_S \in \mathbb{R}^{|S|}$ to denote the projection of the vector $x$ on the coordinates $i \in S$.
When no confusion can arise, we denote by $\mathbf{1}$ the vector with all components equal to 1. We use $x^T$ for
the transpose of $x$, and $x^T y = \sum_i x_i y_i$ for the scalar product in $\mathbb{R}^n$.

For a set or an array $S$, we denote by $\Pi(S)$ the set of all permutations on the elements of $S$. $\pi(S)$
or $\sigma(S)$ designate one particular such permutation, with $\pi(i)$ denoting the element of $S$ appearing in the
$i$-th position under permutation $\pi$.

We use $\Delta^n$ to denote the probability simplex in $\mathbb{R}^n$, i.e., $\Delta^n := \{ p \in \mathbb{R}^n_+ : \mathbf{1}^T p = 1 \}$. For a set
$P \subseteq \mathbb{R}^n$, we use $\mathrm{ext}(P)$ to denote the set of its extreme points.

Throughout the exposition, we adopt the convention that 5 = 1- 

2 Consistent and Inconsistent Risk Measures 

In this section, we provide a brief review of the relevant material in risk theory, and we formalize the 
main question addressed in the current paper. 

2.1 Probabilistic Model 

We begin by describing the probabilistic model. Our notation and framework are closely in line with that 
of Shapiro et al. [53], to which we direct the reader for more details. 

Consider a scenario tree representation of the uncertainty space, where $t \in [0, T]$ denotes the time ($t = 0$
corresponds to the root node), $\Omega_t$ represents the set of nodes at stage $t \in [0, T]$, and $\mathcal{C}_i$ denotes the set
of children of node $i \in \Omega_t$. With the set $\Omega_T$ of elementary outcomes, we associate the $\sigma$-algebra $\mathcal{F}_T = 2^{\Omega_T}$
of all its subsets, and we consider the filtration $\mathcal{F}_0 \subseteq \mathcal{F}_1 \subseteq \dots \subseteq \mathcal{F}_T$, where for any $t \in [0, T - 1]$, $\mathcal{F}_t$
denotes the sub-algebra of $\mathcal{F}_{t+1}$ that is generated by the sets $\mathcal{C}_i$, $i \in \Omega_t$. In other words, the sets $\{\mathcal{C}_i\}_{i \in \Omega_t}$
form the elementary events of $\mathcal{F}_t$ (and there is a one-to-one correspondence between the elementary
events of $\mathcal{F}_t$ and the set of nodes $\Omega_t$ at time $t$). We also note that $\mathcal{F}_0 = \{\emptyset, \Omega_T\}$, the trivial $\sigma$-algebra.

We introduce a probability measure by specifying a vector of conditional probabilities $p_i \in \Delta^{|\mathcal{C}_i|}$ for
every $i \in \Omega_t$, and for every $t \in [0, T - 1]$. Note that $(\mathcal{C}_i, 2^{\mathcal{C}_i}, p_i)$ now represents a valid probability space³.
The conditional probabilities $p_i$ induce a joint probability measure $p \in \Delta^{|\Omega_T|}$ over the leaf nodes, which
we also refer to as the reference measure. To avoid trivial situations, we typically take $p > 0$ (otherwise,
all the arguments can be repeated on a tree where leaves with $p_i = 0$, $i \in \Omega_T$, are removed).
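
As a small illustration of how the conditional probabilities induce the reference measure on the leaves, the sketch below multiplies conditional probabilities along each root-to-leaf path. The dictionary encoding (`children`, `cond_prob`) and the function name `leaf_measure` are assumptions made for this illustration; the tree is the one from Example 1.1.

```python
from fractions import Fraction

# Hypothetical encoding of a scenario tree: children[i] lists the children of node i,
# and cond_prob[j] is the conditional probability of reaching j from its parent.
children = {"R": ["U", "D"], "U": ["UU", "UM", "UD"], "D": ["DU", "DM", "DD"]}
cond_prob = {"U": Fraction(1, 2), "D": Fraction(1, 2)}
cond_prob.update({leaf: Fraction(1, 3) for leaf in ["UU", "UM", "UD", "DU", "DM", "DD"]})

def leaf_measure(root, children, cond_prob):
    """Joint probability of each leaf = product of conditional probabilities on its path."""
    joint = {}
    stack = [(root, Fraction(1))]
    while stack:
        node, prob = stack.pop()
        kids = children.get(node, [])
        if not kids:              # a node without children is a leaf
            joint[node] = prob
        for kid in kids:
            stack.append((kid, prob * cond_prob[kid]))
    return joint

p = leaf_measure("R", children, cond_prob)
```

For this tree every leaf receives probability 1/6, which is the uniform reference measure used in Example 1.1.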

On the space $(\Omega_T, \mathcal{F}_T, p)$, we use $\mathcal{X}_T$ to denote the space of all functions $X_T : \Omega_T \to \mathbb{R}$ that are
$\mathcal{F}_T$-measurable. Since any such function can be identified with a real vector in $\mathbb{R}^{|\Omega_T|}$ (i.e., the space $\mathcal{X}_T$ is
isomorphic with $\mathbb{R}^{|\Omega_T|}$), we denote by $X_T$ the random variable, and by $x_T$ the vector in $\mathbb{R}^{|\Omega_T|}$ of induced
scenario-values. To this end, we identify the expectation of $X_T$ with respect to any measure $q \in \Delta^{|\Omega_T|}$ as
the scalar product of $x_T$ and $q$, i.e., $E_q[X_T] = q^T x_T$. In a similar fashion, we introduce the sequence
$\mathcal{X}_t$, $t \in [0, T - 1]$, where $\mathcal{X}_t$ is the sub-space of $\mathcal{X}_T$ containing functions which are $\mathcal{F}_t$-measurable. Note that
any function $X_t \in \mathcal{X}_t$ is constant on every set $\mathcal{C}_i$, $i \in \Omega_t$, so that $X_t$ can also be identified with the vector
$x_t \in \mathbb{R}^{|\Omega_t|}$. We also introduce $\mathcal{X}_{[t,\tau]} = \mathcal{X}_t \times \dots \times \mathcal{X}_\tau$, for any $t \le \tau$, and, similarly, $X_{[t,\tau]} = (X_t, \dots, X_\tau)$.

2.2 Static Risk Measures 

² In other words, $\{\mathcal{C}_i, i \in \Omega_t\}$ represents a partition of all the nodes in $\Omega_{t+1}$, $\forall t \in \{0, \dots, T - 1\}$.

³ Here and throughout, we adopt the same convention discussed in Section 1.1, whereby for any vector $p \in \mathbb{R}^{|\Omega|}$ and any
$S \in 2^{\Omega}$, $p(S) = \sum_{i \in S} p_i$.

Consider a discrete probability space $(\Omega, \mathcal{F}, P)$, and let $\mathcal{X}$ be a linear space of random variables on $\Omega$,
typically restricted to be a subspace of $L^p(\Omega, \mathcal{F}, P)$, for some $p \ge 1$. In the context of the scenario tree
outlined in Section 2.1, examples of such a space could be $(\Omega_T, \mathcal{F}_T, P)$, but also $(\mathcal{C}_i, 2^{\mathcal{C}_i}, p_i)$, $i \in \Omega_t$,
$t \in [0, T - 1]$. On such a space, we can define a risk measure as any function $\mu : \mathcal{X} \to \mathbb{R}$ satisfying the
following two properties:

[P1] Monotonicity. For any $X, Y \in \mathcal{X}$ such that $X \le Y$, $\mu(X) \le \mu(Y)$.

[P2] Translation invariance. For any $X \in \mathcal{X}$ and any $c \in \mathbb{R}$, $\mu(X + c) = \mu(X) + c$.

A risk measure can be interpreted as the smallest cost-reduction that is needed to make a cost appear
as "acceptable". More precisely, if $\mu(X) \le 0$, then no reduction from $X$ is needed, and $X$ is already
acceptable. In view of this, monotonicity essentially requires that a cost $X$ that is always greater than
a cost $Y$ should always be treated as being riskier, and should therefore require a larger reduction.
Translation invariance (also known as cash invariance) suggests that a deterministic cost increase by an
amount $c$ should result in a required cost reduction that is also increased by $c$. The latter property also
implies that $\mu(X - \mu(X)) = 0$, i.e., $\mu(X)$ is exactly the smallest amount of cost reduction needed to
make $X$ acceptable.

Two well-known examples of risk measures are Value-at-Risk (VaR) and Average Value-at-Risk (AVaR, 
also known as Conditional Value-at-Risk, Tail Value-at-Risk, or Expected Shortfall). They are defined as 
follows (Follmer and Schied [23]): 

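$$\mathrm{VaR}_\epsilon(X) = \inf\{\, m \in \mathbb{R} : P[X - m > 0] \le \epsilon \,\}, \tag{1a}$$

$$\mathrm{AVaR}_\epsilon(X) = \frac{1}{\epsilon} \int_0^\epsilon \mathrm{VaR}_t(X)\, dt. \tag{1b}$$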

$\mathrm{VaR}_\epsilon(X)$ represents the smallest amount $m$ by which the cost/loss $X$ should be decreased so that the
new loss $X - m$ is acceptable with probability (under the reference measure) of at least $1 - \epsilon$. $\mathrm{AVaR}_\epsilon$,
as the name suggests, represents an average of $\mathrm{VaR}_t$ measures, where the tail probability $t$ is taken to be at most
$\epsilon$. When the underlying reference measure $P$ is non-atomic, it can be shown (Follmer and Schied [23])
that $\mathrm{AVaR}_\epsilon(X) = E_P[X \mid X \ge \mathrm{VaR}_\epsilon(X)]$, which motivates the second and third names that the measure
bears.
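
For a finite distribution, both quantities can be evaluated exactly. The following sketch is an illustration under stated assumptions: the function names `var` and `avar` and the three-point loss distribution are hypothetical, $0 < \epsilon < 1$ is assumed, and AVaR is obtained by integrating the step function $t \mapsto \mathrm{VaR}_t(X)$ over $[0, \epsilon]$ piece by piece.

```python
from fractions import Fraction

def tail(costs, probs, m):
    """P[X > m] for a finite distribution."""
    return sum(p for x, p in zip(costs, probs) if x > m)

def var(costs, probs, eps):
    """inf{m : P[X - m > 0] <= eps}; for finite X the infimum is attained at an outcome value."""
    for m in sorted(set(costs)):
        if tail(costs, probs, m) <= eps:
            return m

def avar(costs, probs, eps):
    """(1/eps) * integral of VaR_t over t in [0, eps], computed exactly."""
    # VaR_t is piecewise constant in t, with jumps at the tail probabilities P[X > m].
    cuts = sorted({tail(costs, probs, m) for m in set(costs)})
    cuts = [t for t in cuts if t < eps] + [eps]
    integral = Fraction(0)
    for lo, hi in zip(cuts, cuts[1:]):
        integral += (hi - lo) * var(costs, probs, lo)
    return integral / eps

# Hypothetical loss: 0 with probability 0.90, 50 with probability 0.06, 100 with probability 0.04.
costs = [0, 50, 100]
probs = [Fraction(9, 10), Fraction(3, 50), Fraction(1, 25)]
eps = Fraction(1, 20)

var_5pct = var(costs, probs, eps)    # 50: the loss exceeds 50 with probability 0.04 <= 0.05
avar_5pct = avar(costs, probs, eps)  # 90: (0.04 * 100 + 0.01 * 50) / 0.05
```

In this example AVaR exceeds VaR because it also accounts for the size of the losses beyond the quantile.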

In addition to the two axioms above, it is customary to require additional properties from a risk 
measure. In particular, Artzner et al. [6] introduce the notion of coherent risk measure to represent any 
risk measure fi that additionally satisfies the properties: 

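[P3] Convexity. For any $X, Y \in \mathcal{X}$, and any $\lambda \in [0, 1]$, $\mu(\lambda X + (1 - \lambda) Y) \le \lambda \mu(X) + (1 - \lambda) \mu(Y)$.
[P4] Positive homogeneity. For any $X \in \mathcal{X}$, and any $\lambda \ge 0$, $\mu(\lambda X) = \lambda \mu(X)$.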

The intuition for the requirements is straightforward. The convexity property suggests that diversification
of costs should never increase the risk (or, conversely, that any convex combination of two
acceptable costs $X$ and $Y$ should also be acceptable), while the positive homogeneity axiom suggests that
risk should scale linearly with the size of the cost. Sometimes, only the convexity axiom is imposed in
addition to [P1] and [P2], which leads to the notion of convex risk measures [22]. Of the two measures
introduced earlier, it can be shown that AVaR is a coherent risk measure, while VaR is not (since it fails
the requirement of convexity).
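
A standard textbook-style illustration of why VaR fails convexity is sketched below. The setup is hypothetical and not taken from the paper: two independent, identically distributed loans each lose 100 with probability 0.04, and the risk of an equal-weight mix is compared with the risk of a single loan at level $\epsilon = 0.05$. The helper names are assumptions made here.

```python
from fractions import Fraction
from itertools import product

def var(dist, eps):
    """dist is a list of (cost, prob) pairs; returns inf{m : P[X > m] <= eps}."""
    for m in sorted({x for x, _ in dist}):
        if sum(p for x, p in dist if x > m) <= eps:
            return m

def avar(dist, eps):
    """Average cost over the worst eps-fraction of probability mass."""
    remaining, total = eps, Fraction(0)
    for x, p in sorted(dist, reverse=True):
        take = min(p, remaining)
        total += take * x
        remaining -= take
    return total / eps

eps = Fraction(1, 20)
loan = [(100, Fraction(1, 25)), (0, Fraction(24, 25))]
# Distribution of (X + Y) / 2 for two independent copies X, Y of the loan.
mix = [(Fraction(x + y, 2), p * q) for (x, p), (y, q) in product(loan, loan)]

var_single, var_mix = var(loan, eps), var(mix, eps)      # 0 and 50
avar_single, avar_mix = avar(loan, eps), avar(mix, eps)  # 80 and 258/5
```

Here VaR assigns risk 0 to each loan but 50 to their mix, so diversification appears to increase risk, whereas AVaR of the mix (51.6) stays below the AVaR of a single loan (80), as convexity requires.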

A central result in the theory of risk measures is the following representation theorem, which any 
coherent risk measure must obey. 

Theorem 2.1 (Representation Theorem for Coherent Risk Measures [6]). A function $\mu : \mathcal{X} \to \mathbb{R}$ is a
coherent risk measure if and only if there exists a family of probability measures $\mathcal{Q}$ on $(\Omega, \mathcal{F})$, absolutely
continuous⁴ with respect to $P$, such that

$$\mu(X) = \sup_{Q \in \mathcal{Q}} E_Q[X], \quad \forall X \in \mathcal{X}. \tag{2}$$

⁴ Since our exposition is only focused on discrete probability spaces, and we always take the reference measure $P$ to satisfy
$P(\omega) > 0$, $\forall \omega \in \Omega$, the condition is trivially satisfied, and can be dropped from the statement.
The result essentially states that any coherent risk measure can be seen as an expectation taken with
respect to a worst-case measure, which is chosen from a suitable set of test measures (or generalized
scenarios) $\mathcal{Q}$. For instance, the coherent risk measure $\mathrm{AVaR}_\epsilon$ has the associated set of test measures [23]

$$\mathcal{Q}_{\mathrm{AVaR}_\epsilon} = \left\{ Q \in \Delta^{|\Omega|} : Q(\omega) \le \tfrac{1}{\epsilon} P(\omega), \ \forall \omega \in \Omega \right\}. \tag{3}$$
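
To make the worst-case interpretation concrete, the sketch below constructs a maximizer of $E_Q[X]$ over the set in (3) for a finite sample space. It is an illustration under stated assumptions: the function name `worst_case_measure` is hypothetical, $\epsilon \le 1$ is assumed, and the greedy rule (load the costliest outcomes up to their cap $P(\omega)/\epsilon$) is the usual solution of this bounded linear program.

```python
from fractions import Fraction

def worst_case_measure(costs, probs, eps):
    """Maximize E_Q[X] over {Q in the simplex : Q(w) <= P(w)/eps} by filling costliest outcomes first."""
    q = [Fraction(0)] * len(costs)
    remaining = Fraction(1)          # total mass still to be assigned
    for i in sorted(range(len(costs)), key=lambda i: costs[i], reverse=True):
        q[i] = min(probs[i] / eps, remaining)
        remaining -= q[i]
    return q

# Data of Example 1.1 at the root node: six equally likely leaves, level 2/3.
costs = [10, -12, -14, 20, -22, -22]
probs = [Fraction(1, 6)] * 6
eps = Fraction(2, 3)

q_star = worst_case_measure(costs, probs, eps)
worst_case_expectation = sum(q * x for q, x in zip(q_star, costs))
```

The adversary places mass 1/4 on each of the four largest costs, and the resulting expectation equals $\mathrm{AVaR}_{2/3}(Y) = 1$ from Example 1.1.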

While convexity is generally considered an eminently sensible requirement (Follmer and Schied [22]), 
it may reward diversification even among costs that always vary together. More precisely, one might 
conclude that the risk associated with the total cost X + Y is strictly smaller than the sum of the 
individual risks, even when $X$ and $Y$ are comonotone, i.e., $[X(\omega) - X(\omega')]\,[Y(\omega) - Y(\omega')] \ge 0$, for any
$\omega, \omega' \in \Omega$. To correct for this undesirable effect, a typical axiom that is required of a risk measure $\mu$ is
that of comonotonicity, i.e.,

[P5] Comonotonicity. $\mu(X + Y) = \mu(X) + \mu(Y)$ for any $X, Y \in \mathcal{X}$ that are comonotone.

It is known that comonotonicity actually implies positive homogeneity (see Follmer and Schied [23]), so
the class of coherent and comonotonic risk measures is, in fact, identical to that of convex and comonotonic
risk measures (however, not all coherent risk measures are comonotonic - see Acerbi [3] for a
counterexample).

Risk measures that are comonotonic are known to have a special representation in terms of integrals 
of Choquet capacities. To this end, we introduce the following terminology. 

Definition 2.1. A set function $c : 2^{\Omega} \to [0, 1]$ is said to be a Choquet capacity if it satisfies the following
properties:

• nondecreasing: $c(A) \le c(B)$, $\forall A \subseteq B \subseteq \Omega$

• normalized: $c(\emptyset) = 0$ and $c(\Omega) = 1$

• submodular: $c(A \cap B) + c(A \cup B) \le c(A) + c(B)$, $\forall A, B \subseteq \Omega$.
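
On a small ground set, the three properties of Definition 2.1 can be checked by brute force. The sketch below is purely illustrative: the function names and the two example capacities (one built from a concave function of $|S|/|\Omega|$, one from a convex function) are assumptions made here, and the enumeration is exponential in $|\Omega|$.

```python
from fractions import Fraction
from itertools import combinations

def subsets(ground):
    """All subsets of the ground set, as frozensets."""
    return [frozenset(c) for r in range(len(ground) + 1) for c in combinations(ground, r)]

def is_choquet_capacity(c, ground):
    """Brute-force check of the normalized, nondecreasing and submodular properties."""
    all_sets = subsets(ground)
    normalized = c(frozenset()) == 0 and c(frozenset(ground)) == 1
    nondecreasing = all(c(a) <= c(b) for a in all_sets for b in all_sets if a <= b)
    submodular = all(c(a & b) + c(a | b) <= c(a) + c(b) for a in all_sets for b in all_sets)
    return normalized and nondecreasing and submodular

ground = range(4)
eps = Fraction(1, 2)
# Capacity min(1, P(S)/eps) under a uniform P: a concave function of |S|.
avar_capacity = lambda s: min(Fraction(1), Fraction(len(s), len(ground)) / eps)
# A convex function of |S|: normalized and nondecreasing, but not submodular.
convex_capacity = lambda s: Fraction(len(s), len(ground)) ** 2

avar_ok = is_choquet_capacity(avar_capacity, ground)      # True
convex_ok = is_choquet_capacity(convex_capacity, ground)  # False
```

The second capacity fails because, for two disjoint singletons, $c(A \cup B) = 1/4$ exceeds $c(A) + c(B) = 1/8$.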

If the normalization condition did not require $c(\Omega) = 1$, then the notion of a Choquet capacity would
exactly correspond to that of a rank function of a polymatroid, a concept that has been extensively studied
in the field of combinatorial optimization (see, e.g., Chapter 2 of Fujishige [25]). Since this additional
requirement does not really impact any of the main results concerning the theoretical and computational 
properties of the function c (in effect, it can be immediately enforced by a simple scaling, for any c that 
is not identically zero), we actually use the names Choquet capacity and rank function of a polymatroid 
interchangeably throughout the current paper. 

With this definition, the following representation theorem⁵ is known to hold for any convex and
comonotonic risk measures (Schmeidler [48], Follmer and Schied [23]).

Theorem 2.2 (Representation Theorem for Convex, Comonotonic Risk Measures). A convex risk measure
$\mu$ is comonotonic if and only if it arises as the Choquet integral with respect to a monotone, normalized,
submodular set function $c$. In this case,

$$\mu(X) = \max_{Q \in \mathcal{Q}_c} E_Q[X], \tag{4}$$

where $\mathcal{Q}_c = \{ Q \in \Delta^{|\Omega|} : Q(S) \le c(S), \ \forall S \in \mathcal{F} \}$.

⁵ We write a simplified version here, suitable for a discrete probability space. For the more general version, involving
Choquet integrals, refer to Schmeidler [48] and Chapter 4 in Follmer and Schied [23].
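
For a finite sample space, the maximum in (4) can be evaluated without solving a linear program, using the classical greedy rule for linear optimization over the polytope of a submodular function: sort outcomes by decreasing cost and give each one the marginal capacity of the growing set. The sketch below is an illustration; the function name `choquet_risk` and the three example capacities are assumptions made here.

```python
from fractions import Fraction

def choquet_risk(costs, capacity):
    """Greedy evaluation of max_{Q in Q_c} E_Q[X] for a Choquet capacity on subsets of indices."""
    order = sorted(range(len(costs)), key=lambda i: costs[i], reverse=True)
    chosen, value, previous = set(), Fraction(0), Fraction(0)
    for i in order:
        chosen.add(i)
        current = capacity(frozenset(chosen))
        value += (current - previous) * costs[i]   # weight = marginal capacity of outcome i
        previous = current
    return value

costs = [10, -12, -14, 20, -22, -22]   # leaf costs of Example 1.1
n = len(costs)

avar_cap = lambda s: min(Fraction(1), Fraction(len(s), n) / Fraction(2, 3))  # AVaR at level 2/3, uniform P
mean_cap = lambda s: Fraction(len(s), n)                                     # c = P gives the expectation
max_cap = lambda s: Fraction(1) if s else Fraction(0)                        # gives the worst-case cost

risk_avar = choquet_risk(costs, avar_cap)   # 1
risk_mean = choquet_risk(costs, mean_cap)   # -20/3
risk_max = choquet_risk(costs, max_cap)     # 20
```

The three capacities span the range from risk neutrality (the expectation) to full conservatism (the maximum cost), with AVaR in between.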
It is worth noting that, since any Choquet capacity $c$ is normalized, the constraint $Q \in \Delta^{|\Omega|}$ is
equivalent to $Q \ge 0$, $Q(\Omega) = c(\Omega) = 1$. Therefore, the polytope $\mathcal{Q}_c$ can be written equivalently as
$\mathcal{Q}_c = \mathcal{B}_c \cap \mathbb{R}_+^{|\Omega|}$, where

$$\mathcal{B}_c = \{ Q \in \mathbb{R}^{|\Omega|} : Q(S) \le c(S), \ \forall S \in \mathcal{F}, \ Q(\Omega) = c(\Omega) \}$$

denotes the base polytope corresponding to the polymatroid rank function $c$ (see Section 7.2 in the
Appendix and Chapter 44 in volume B of Schrijver [51]). In fact, since $c$ is always nondecreasing, it can
be shown that $\mathcal{B}_c \subseteq \mathbb{R}_+^{|\Omega|}$ (see Corollary 7.1 in the Appendix), so that

$$\mathcal{Q}_c = \mathcal{B}_c. \tag{5}$$

This analogy will prove very useful in our analysis, since it will enable a discussion of the properties of
$\mathcal{Q}_c$ by making reference to theorems known for base polytopes of submodular functions. Section 7.2 of
the Appendix contains all the results that are relevant for our treatment, and the interested reader is 
referred to Fujishige [25] and volume B in Schrijver [51] for a very comprehensive discussion of the topic. 

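As an example of the representation in Theorem 2.2, the Choquet capacity corresponding to the risk
measure $\mathrm{AVaR}_\epsilon$ is given by $c_{\mathrm{AVaR}_\epsilon}(S) = \min\left(1, \frac{P(S)}{\epsilon}\right)$ (see Chapter 4 of Follmer and Schied [23]). In fact,
since all the constraints $Q(S) \le 1$, $\forall S \subseteq \Omega$, are implied by $Q \ge 0$, $Q(\Omega) = 1$, one could equivalently use as
Choquet capacity $c_{\mathrm{AVaR}_\epsilon}(S) = \frac{P(S)}{\epsilon}$ for defining $\mathrm{AVaR}_\epsilon$ (this can also be seen directly from equation (3)).
When the reference measure $P$ is uniform, i.e., $P = \frac{1}{|\Omega|} \mathbf{1}$, this yields the following equivalent representations
for the set of measures determining $\mathrm{AVaR}_\epsilon$:

$$\begin{aligned}
\mathcal{Q}_{\mathrm{AVaR}_\epsilon} &= \left\{ Q \in \Delta^{|\Omega|} : Q(S) \le \min\left(1, \tfrac{|S|}{\epsilon |\Omega|}\right), \ \forall S \subseteq \Omega \right\} \\
&= \left\{ Q \in \Delta^{|\Omega|} : Q(S) \le \tfrac{|S|}{\epsilon |\Omega|}, \ \forall S \subseteq \Omega \right\} \\
&= \left\{ Q \in \Delta^{|\Omega|} : Q(\{i\}) \le \tfrac{1}{\epsilon |\Omega|}, \ \forall i \in \Omega \right\}.
\end{aligned} \tag{6}$$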

A final attribute of risk measures, which we also require in our analysis, is law-invariance.

[P6] Law-invariance. $\mu(X) = \mu(Y)$ for any $X, Y \in \mathcal{X}$ such that $F_X(\cdot) = F_Y(\cdot)$.

In words, law-invariant risk measures only depend on the probability distribution of the random variables
involved, and not on other elements of the probability space (e.g., $\Omega$ or $\mathcal{F}$). This is a very reasonable
requirement to impose in practice, since any risk measure that is not law-invariant cannot be estimated
from empirical data, raising the question of its practical usefulness (see Acerbi [3]). With this terminology
in place, we now define the main object of interest in our treatment.

Definition 2.2. A coherent risk measure that is also comonotonic and law invariant is called a distortion 
risk measure. 

The class of distortion risk measures (also called spectral risk measures; see Acerbi [2, 3]) has
been examined in numerous papers in the fields of actuarial finance (e.g., Wang [58]), determination of
capital requirements (Tsanakas [55]), of margin requirements in clearing houses (Cotter and Dowd [15]),
or in the general theory of (financial) risk management (Kusuoka [31], Acerbi [2] etc.).

A particularly relevant way of generating distortion risk measures is to consider Choquet capacities 
obtained as 

$$c(S) = \Psi(P(S)), \quad \forall S \in \mathcal{F}, \tag{7}$$

where $\Psi : [0, 1] \to [0, 1]$ is any concave, nondecreasing function satisfying $\Psi(0) = 0$ and $\Psi(1) = 1$.
This representation furthermore emphasizes the choice of the name for this class of measures, since the
function $\Psi$ can be seen as a distortion of the underlying probability $P$ for any event $S$. For a more detailed
interpretation of these axioms, as well as a discussion of their relevance in financial settings, we direct 
the reader to the papers [6, 4], as well as the book [23], and the references therein. 
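
When the capacity has the form (7), the risk measure depends on the cost only through its distribution, and on a finite space it reduces to a weighted sum of the sorted costs. The sketch below is an illustration under stated assumptions: the function name `distortion_risk` and the three distortion functions are hypothetical choices, and the weight on the $k$-th largest cost is taken to be $\Psi(P_k) - \Psi(P_{k-1})$, with $P_k$ the total probability of the $k$ largest costs.

```python
from fractions import Fraction

def distortion_risk(costs, probs, psi):
    """Choquet integral for c(S) = psi(P(S)) on a finite sample space."""
    cumulative, previous, value = Fraction(0), psi(Fraction(0)), Fraction(0)
    for x, p in sorted(zip(costs, probs), reverse=True):
        cumulative += p                     # probability of the costs seen so far
        current = psi(cumulative)
        value += (current - previous) * x   # distorted probability increment times cost
        previous = current
    return value

costs = [10, -12, -14, 20, -22, -22]        # leaf costs of Example 1.1
probs = [Fraction(1, 6)] * 6

psi_mean = lambda p: p                                        # no distortion: plain expectation
psi_avar = lambda p: min(Fraction(1), p / Fraction(2, 3))     # AVaR at level 2/3
psi_blend = lambda p: (psi_mean(p) + psi_avar(p)) / 2         # concave, nondecreasing, normalized

risk_mean = distortion_risk(costs, probs, psi_mean)    # -20/3
risk_avar = distortion_risk(costs, probs, psi_avar)    # 1
risk_blend = distortion_risk(costs, probs, psi_blend)  # -17/6
```

Since the value is linear in $\Psi$, the blended distortion yields the average of the expectation and the AVaR, which is one simple way of interpolating between risk-neutral and tail-focused assessments.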
2.3 Dynamic Risk Measures

We now turn our attention to the question of dynamic risk measures, i.e., risk measures defined for cash 
streams that are received or dispensed across several time-periods. While the axiomatic treatment of risk 
measures in static settings is quite well understood, and there is generally a mutual agreement over what 
constitutes a sensible risk metric, the situation in dynamic settings is more difficult and contentious. 

An immediate complication in dynamic settings is that even the specification of the risk preferences
no longer entails constructing a single risk measure $\mu$, but rather a sequence of risk measures
$\mu_{t,T} : \mathcal{X}_{[t,T]} \to \mathcal{X}_t$, which map a future stream of random costs $X_{[t,T]} = (X_t, \dots, X_T)$ into a risk
assessment at time $t$. This conditional risk measure can be interpreted as the fair $\mathcal{F}_t$-measurable
cost-reduction that would make the future cost sequence $X_{[t,T]}$ acceptable at time $t$. Since this assessment has
to be made for any $t \in [0, T - 1]$, a dynamic risk measure must specify the entire sequence of conditional
risk measures $\{\mu_{t,T}\}_{t=0}^{T-1}$ (note that, at time $T$, all uncertain quantities are already fully known).

A main point of contention in dynamic problems is the question of the consistency of the risk preferences
over time. Recall from Example 1.1 in Section 1 that taking $\mu_{t,T}$ as a static risk measure conditioned
on the filtration $\mathcal{F}_t$ can easily result in dynamically inconsistent behavior, whereby positions that are
initially unacceptable at time $t$ become acceptable in all states of the world at time $t + 1$. Several notions of
dynamic (or time) consistency are possible (see Penner [35] and Acciaio and Penner [1] for an overview),
and each yields slightly different representation theorems and conclusions regarding the risk measures
$\mu_{t,T}$. The notion that we adopt here is closest in spirit to that of strong dynamic consistency (Acciaio
and Penner [1], Riedel [38], Artzner et al. [7], Detlefsen and Scandolo [16], Roorda et al. [43], Cheridito
et al. [13], Follmer and Penner [21], Ruszczynski and Shapiro [45], Ruszczynski [44]), and is defined as
follows.

Definition 2.3 (Ruszczynski [44]). A dynamic risk measure $\{\mu_{t,T}\}_{t=0}^{T-1}$ is called dynamically consistent
(or time-consistent) if, for all $0 \le t < \tau \le T - 1$ and all sequences $X_{[t,T]}, W_{[t,T]} \in \mathcal{X}_{[t,T]}$, the conditions

$$X_k = W_k, \ \forall k = t, \dots, \tau - 1 \quad \text{and} \quad \mu_{\tau,T}(X_{[\tau,T]}) \le \mu_{\tau,T}(W_{[\tau,T]})$$

imply that

$$\mu_{t,T}(X_{[t,T]}) \le \mu_{t,T}(W_{[t,T]}).$$

In other words, if the $X$ cost sequence is deemed less risky than the $W$ cost sequence at some time 
in the future ($\tau$), and they yield identical costs from the current time ($t$) to the future time 
($\tau$), then the $X$ sequence should be deemed less risky at the current time ($t$) as well. 
Recalling Example 1.1 from Section 1, it can be immediately seen that a naive application of (even 
coherent) static risk measures can result in inconsistent dynamic behavior. 

As such, more care is needed when defining appropriate (i.e., consistent) risk measures in dynamic 
settings. The following result has been shown in several different contexts (see [45, 44], and 
references therein). 

Theorem 2.3. Any dynamically consistent risk measure $\{\mu_{t,T}\}_{t=0}^{T-1}$ that satisfies the 
properties 

• $\mathcal{F}_t$-measurable translation: $\mu_{t,T}(X_{[t,T]}) = X_t + \mu_{t,T}(0, X_{t+1}, \dots, X_T)$ 

• normalization: $\mu_{t,T}(0, \dots, 0) = 0$ 

can be written as 

$$\mu_{t,T}(X_{[t,T]}) = X_t + \mu_{t+1}\Big(X_{t+1} + \mu_{t+2}\big(X_{t+2} + \cdots + \mu_T(X_T) \cdots \big)\Big), \tag{8}$$

where $\mu_t : \mathcal{X}_t \to \mathcal{X}_{t-1}$, $t \in [1, T]$, are a set of single-period conditional 
risk mappings. 

We note that this representation is quite general, in that it only requires the axiom of 
$\mathcal{F}_t$-measurable translation and a (very natural) normalization axiom. In words, the result 
essentially states that any time-consistent risk measure has to have the compositional form (8), and 
that specifying the risk measure (preference) $\mu_{t,T}$ at time $t$ that would apply to a future cost 
sequence $X_{[t,T]}$ can be done by prescribing all the conditional risk mappings $\mu_\tau$, 
$\tau \in [t + 1, T]$. Note that a typical latter object (say, at time $\tau \ge t$) is a risk mapping, 
i.e., a multifunction from $\mathcal{X}_{\tau+1}$ into $\mathcal{X}_\tau$. Therefore, from a practical 
perspective, specifying and calibrating such a risk measure (e.g., from behavioral observations or 
other empirical data) is generally quite cumbersome. 

Stronger results are possible under additional requirements on the one-step conditional risk mappings 
$\mu_t$ appearing in the representation. For instance, by requiring $\mathcal{F}_t$-measurable 
translation, i.e., that $\mu_t(X_t + X_{t+1}) = X_t + \mu_t(X_{t+1})$, $\forall X_t \in \mathcal{X}_t$, 
$\forall X_{t+1} \in \mathcal{X}_{t+1}$, it can be shown (Ruszczyński [44]) that the risk measure in (8) 
only depends on the total costs occurring in the future, i.e., 

$$\mu_{t,T}(X_{[t,T]}) = \mu_{t+1}\big(\mu_{t+2}(\cdots(\mu_T(X_t + \cdots + X_T))\cdots)\big) = (\mu_{t+1} \circ \mu_{t+2} \circ \cdots \circ \mu_T)(X_t + \cdots + X_T).$$

This is a framework that has been considered in a large body of literature [38, 7, 16, 43, 13, 21, 45, 44], 
and which we also adopt in the present paper. 

2.4 Model Description and Problem Statement 

We finalize the discussion in the current section by describing the exact risk model considered, and by 
formalizing the main question addressed in the present paper. 

Recall from the discussion in Section 2.3 that a dynamically consistent risk measure $\mu_{0,T}$ that 
evaluates a future stream of costs $X_{[0,T]} = (X_0, \dots, X_T)$ from the perspective of time $t = 0$ 
can be written as 

$$\mu_{0,T}(X_{[0,T]}) = (\mu_1 \circ \mu_2 \circ \cdots \circ \mu_T)(X_0 + \cdots + X_T), \tag{9}$$

where $\mu_t : \mathcal{X}_t \to \mathcal{X}_{t-1}$ are one-period conditional risk mappings, for any 
$t \in [1, T]$. In the context of the scenario tree introduced in Section 2.1, every such conditional 
risk mapping is exactly given by $\mu_t = (\mu_t^i)_{i \in \Omega_{t-1}}$, $t \in [1, T]$, where 
$\mu_t^i : \mathbb{R}^{|\mathcal{C}_i|} \to \mathbb{R}$, $\forall i \in \Omega_{t-1}$, represents a 
(conditional) risk measure associated with node $i \in \Omega_{t-1}$ (see Shapiro et al. [53]). Since 
every $\mu_t^i$ is now a static (i.e., single-period) measure, it can be required to satisfy additional 
axiomatic properties, such as the ones introduced in Section 2.2. This allows us to define one of the 
main objects of interest in the current paper. 

Definition 2.4 (Dynamically Consistent Distortion Risk Measures). A multi-period risk measure 
$\mu_{0,T}$ is said to be a dynamically consistent distortion risk measure if and only if it satisfies 
the following two conditions: 

• It obeys a representation of the form $\mu_{0,T}(X_{[0,T]}) = (\mu_1 \circ \mu_2 \circ \cdots \circ \mu_T)(X_0 + \cdots + X_T)$. 

• For every $t \in [1, T]$, $\mu_t = (\mu_t^i)_{i \in \Omega_{t-1}}$, and every $\mu_t^i$ is a distortion risk measure. 

As mentioned in the introduction, the main motivation for our paper is to compare such dynamically 
consistent risk formulations with naive applications of "sophisticated" static (i.e., single-period) 
risk measurements, which can be potentially inconsistent. In particular, we consider a problem where a 
stream of future costs $X_{[0,T]}$ is evaluated from the perspective of time $t = 0$, and two potential 
risk specifications are available: 

• (Inconsistent) A static distortion risk measure, $\mu_I : \mathcal{X}_T \to \mathbb{R}$, is applied to the total cost $X_0 + \cdots + X_T$. 

• (Consistent) A dynamically consistent distortion risk measure $\mu_{0,T} : \mathcal{X}_{[0,T]} \to \mathbb{R}$ is applied to $X_{[0,T]}$. 

In this context, the goal of the present paper is to take the first step towards understanding the 
tradeoffs between the two risk measurements (when is one larger than the other, and by how much), as 
well as to attempt a systematic construction of a dynamically consistent measure starting with a 
dynamically inconsistent one, in a way that remains as "close" as possible to the inconsistent 
formulation. Such a question is important because, in practice, it may be easier to elicit a static 
risk function from a decision maker, while a complete specification for a multi-period process is much 
more difficult. A better understanding of the tradeoffs between the two, as well as a systematic way of 
building one from the other, could then immediately provide a recipe for systematically assessing (and 
optimizing over) risk in dynamic decision problems. 
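
To make the two risk specifications above concrete, the following minimal Python sketch evaluates both on a two-period tree with a uniform reference measure. It assumes, purely for illustration, that every measure involved is an Average Value-at-Risk; the function names `avar`, `static_avar`, `composed_avar` and the cost matrix `Y` are hypothetical and do not appear in the paper.

```python
import numpy as np

def avar(costs, eps):
    """AVaR at level eps of a cost vector under the uniform distribution.

    Uses the robust representation: maximize q @ costs over probability
    vectors q whose entries are capped at 1 / (n * eps).
    """
    costs = np.sort(np.asarray(costs, dtype=float))[::-1]  # worst costs first
    n = len(costs)
    cap = 1.0 if eps <= 1.0 / n else 1.0 / (n * eps)  # per-outcome mass cap
    remaining, value = 1.0, 0.0
    for c in costs:
        mass = min(cap, remaining)
        value += mass * c
        remaining -= mass
        if remaining <= 1e-15:
            break
    return float(value)

def static_avar(Y, eps):
    """Inconsistent evaluation: a single AVaR applied to all leaf costs."""
    return avar(np.asarray(Y).ravel(), eps)

def composed_avar(Y, eps1, eps2):
    """Consistent evaluation: AVaR_eps1 of the conditional AVaR_eps2 values.

    Row i of Y holds the total costs at the children of first-stage node i.
    """
    stage2 = [avar(row, eps2) for row in np.asarray(Y)]
    return avar(stage2, eps1)

# Illustrative data: 2 first-stage nodes, 2 children each.
Y = np.array([[0.0, 10.0], [4.0, 6.0]])
print(static_avar(Y, 0.5), composed_avar(Y, 0.5, 0.5))  # prints 8.0 10.0
```

In this small instance the composed evaluation exceeds the static one at the same level, which illustrates that neither specification dominates the other without a condition linking the levels.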

A first step in this direction, which is addressed in the present paper, is to provide a satisfactory 
answer to the following question. 

Problem 1. Given a static distortion risk measure $\mu_I$ and a dynamically consistent distortion risk 
measure $\mu_{0,T}$, compute the smallest possible factor $\alpha$ such that 

$$\mu_{0,T}(X_0, \dots, X_T) \le \mu_I(X_0 + \cdots + X_T) \le \alpha \cdot \mu_{0,T}(X_0, \dots, X_T), \quad \forall X_t \in \mathcal{X}_t, \; \forall t \in \{0, \dots, T\}.$$

We call the tightest such $\alpha$ (denoted $\alpha^*$ in what follows) the price of dynamic 
consistency, reflecting the amount by which an inconsistent assessment of capital requirements (under 
$\mu_I$) would have to be scaled in order to ensure that it meets the dynamically consistent 
requirements (under $\mu_C$). In practice, it may be far easier to elicit a single (inconsistent) risk 
measure $\mu_I$ from a manager, or from observed preferences. By appropriately scaling such a risk 
measure with the factor $\alpha^*$, we would ensure that the resulting measurement always corresponds 
to a particular dynamically consistent assessment, which is axiomatically justified, as well as 
computationally attractive (it is known that dynamically consistent formulations allow an application 
of the Bellman recursions for Dynamic Programming; see, e.g., Nilim and El Ghaoui [34], Iyengar [29], 
Ruszczyński [44]). 

We note that a similar concept of inner and outer approximations by means of distortion risk measures 
appears in Bertsimas and Brown [12]. However, the goal and analysis there are quite different, since the 
question is to approximate a single polyhedral uncertainty set (representing an otherwise arbitrary static 
risk measure) by means of an uncertainty set derived from a static distortion risk measure. 

We remark that, in view of the representation (8) for dynamically consistent measures, we have 
$\mu_{0,T}(X_0, \dots, X_T) = \mu_C(X_0 + \cdots + X_T)$ for a suitable $\mu_C$, so that the question 
above simplifies to 

$$\mu_C(Y) \le \mu_I(Y) \le \alpha \cdot \mu_C(Y), \quad \forall Y \in \mathcal{X}_T, \tag{10}$$

where $Y := X_0 + \cdots + X_T$ denotes the total future cost. In the current setting, note that, if we 
insisted on $\mu_C(Y) \le \mu_I(Y)$ holding for any cost $Y$, and if $\mu_C, \mu_I$ were allowed to 
take both positive and negative values, then the question in Problem 1 would be meaningless, in that 
no feasible $\alpha$ would exist satisfying (10). To this end, we introduce the following standing 
assumption throughout the remainder of the analysis. 

Assumption 2.1 (Non-negative Losses). The stochastic losses Y are non-negative. 

This requirement is not too restrictive whenever a lower bound $Y_L$ is available for $Y$. By using 
the cash-invariance property (Property [P2] in Section 2.2) of the risk measures involved, one could 
reformulate the original question with regard to the random loss $Y - Y_L$, which would be nonnegative. 
Furthermore, in specific applications (such as inventory management), $Y$ is the sum of intra-period 
costs $X_t$ that are non-negative, so requiring $Y$ to be nonnegative is quite sensible. 

3 Finding the Optimal $\alpha$ 

In this section, we discuss the necessary and sufficient conditions that yield the desired relation (10), 
which also lead us to a characterization of the optimal bound $\alpha$. In order to simplify the 
notation and maintain the clarity of the ideas presented, we mainly discuss a two-period model (i.e., 
$T = 2$). However, the methodology and results that we introduce immediately lend themselves to an 
arbitrary (but finite) number of time periods, an extension which we discuss towards the end of the 
section. 

The following proposition provides a concise characterization for the risk measure $\mu_I$. 

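Proposition 3.1. The distortion risk measure $\mu_I$ obeys the following robust representation theorem 

$$\mu_I(Y) = \max_{q \in \mathcal{Q}_I} q^T Y, \quad \forall Y \in \mathcal{X}_2, \tag{11a}$$

$$\mathcal{Q}_I = \{ q \in \Delta^{|\Omega_2|} : q(S) \le c(S), \; \forall S \subseteq \Omega_2 \} \equiv \mathcal{B}_c, \tag{11b}$$

where the function $c : 2^{\Omega_2} \to \mathbb{R}$ is a Choquet capacity, and $\mathcal{B}_c$ is the 
base polytope corresponding to the polymatroid rank function $c$. Moreover, if $Y \ge 0$, then we also 
have 

$$\mu_I(Y) = \max_{q \in \mathrm{sub}(\mathcal{Q}_I)} q^T Y, \tag{12}$$

where $\mathrm{sub}(\mathcal{Q}_I) = \{ x \in \mathbb{R}_+^{|\Omega_2|} : \exists q \in \mathcal{Q}_I, \; x \le q \}$ 
is the down-monotone closure of the polytope $\mathcal{Q}_I$. 

Proof. Representation (11a), (11b) follows directly from Theorem 2.2 for distortion risk measures (also 
see the discussion in Section 2.2 and Theorem 4.88 in Föllmer and Schied [23]). The result concerning 
$Y \ge 0$ follows by recognizing that, since $\mathcal{Q}_I \subseteq \mathbb{R}_+^{|\Omega_2|}$, we have 
$\max_{q \in \mathcal{Q}_I} q^T Y = \max_{q \in \mathrm{sub}(\mathcal{Q}_I)} q^T Y$, $\forall Y \ge 0$ 
(see Theorem 7.1 in the Appendix). □ 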

The above result states that the polytope of probability measures determining the inconsistent risk 
metric is exactly the base polytope corresponding to the polymatroid rank function $c$. Furthermore, 
when examining only nonnegative losses $Y$, we can equivalently enlarge the set of measures 
$\mathcal{Q}_I$ in the representation to contain all nonnegative vectors that are component-wise 
smaller than vectors in $\mathcal{Q}_I$. The polytope $\mathrm{sub}(\mathcal{Q}_I)$ no longer contains 
only valid probability measures but, according to (12), this does not change the risk evaluation. For 
an example of the polytopes $\mathcal{Q}_I$ and $\mathrm{sub}(\mathcal{Q}_I)$, please refer to Figure 2. 
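
The representation (11a), (11b) also suggests a simple way of evaluating $\mu_I$ numerically: the maximum of a linear function over a base polytope is attained at an extreme point produced by the classical greedy rule for submodular functions. The sketch below is a hypothetical illustration of this idea, assuming that the capacity is supplied as a Python callable on sets of outcome indices and that it is submodular; the helper names and the example capacity $\min(1, |S|/k)$ are our own choices for illustration.

```python
import numpy as np

def avar_capacity(k):
    """Illustrative capacity S -> min(1, |S| / k)."""
    return lambda S: min(1.0, len(S) / k)

def greedy_extreme_point(capacity, order):
    """Extreme point of the base polytope obtained from the marginal
    gains of the capacity along the ordering `order` of the outcomes."""
    q = np.zeros(len(order))
    prefix, prev = [], 0.0
    for j in order:
        prefix.append(j)
        cur = capacity(frozenset(prefix))
        q[j] = cur - prev  # marginal gain of adding outcome j
        prev = cur
    return q

def distortion_measure(Y, capacity):
    """max of q @ Y over the base polytope, via the greedy rule:
    visit outcomes from the largest cost to the smallest."""
    Y = np.asarray(Y, dtype=float)
    order = np.argsort(-Y, kind="stable")
    return float(greedy_extreme_point(capacity, order) @ Y)

# Three outcomes, capacity min(1, |S| / 2): the two worst costs get mass 1/2 each.
print(distortion_measure([3.0, 1.0, 2.0], avar_capacity(2.0)))  # prints 2.5
```

The greedy rule avoids enumerating the exponentially many constraints $q(S) \le c(S)$; it relies only on the submodularity of the capacity.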




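Figure 2: Example of $\mathcal{Q}_I$ and $\mathrm{sub}(\mathcal{Q}_I)$. Here, $|\Omega_2| = 3$ and $c(S) = \min(1, \ldots)$. 

A similar result can be provided for the dynamically consistent risk measure $\mu_C$, as follows. 

Proposition 3.2. The dynamically consistent measure $\mu_C$ is given by 

$$\mu_C(Y) = \max_{q \in \mathcal{Q}_C} q^T Y, \quad \forall Y \in \mathcal{X}_2, \tag{13a}$$

$$\mathcal{Q}_C = \left\{ q \in \Delta^{|\Omega_2|} : \exists p \in \Delta^{|\Omega_1|} : \begin{array}{l} p(S) \le c_1(S), \; \forall S \subseteq \Omega_1 \\ q(U) \le p_i \cdot c_{2|i}(U), \; \forall U \subseteq \mathcal{C}_i, \; \forall i \in \Omega_1 \end{array} \right\}$$

$$\equiv \{ q \in \mathbb{R}_+^{|\Omega_2|} : \exists p \in \mathcal{B}_{c_1} : q|_{\mathcal{C}_i} \in \mathcal{B}_{p_i \cdot c_{2|i}}, \; \forall i \in \Omega_1 \}, \tag{13b}$$

where $c_1 : 2^{\Omega_1} \to \mathbb{R}$ and $c_{2|i} : 2^{\mathcal{C}_i} \to \mathbb{R}$, 
$\forall i \in \Omega_1$, are Choquet capacities, and $\mathcal{B}_{c_1}$, $\mathcal{B}_{c_{2|i}}$ are 
the base polytopes corresponding to the functions $c_1$ and $c_{2|i}$, respectively. Moreover, if 
$Y \ge 0$, then $\mu_C(Y) = \max_{q \in \mathrm{sub}(\mathcal{Q}_C)} q^T Y$, where 
$\mathrm{sub}(\mathcal{Q}_C)$ is the down-monotone closure of $\mathcal{Q}_C$. 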

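Proof. In view of Definition 2.4, the dynamically consistent distortion risk measure $\mu_C$ can be 
written as $\mu_1 \circ \mu_2$, where $\mu_1 : \mathcal{X}_1 \to \mathbb{R}$ is the first-period 
distortion risk measure, and $\mu_2^i : \mathbb{R}^{|\mathcal{C}_i|} \to \mathbb{R}$, 
$\forall i \in \Omega_1$, are the (conditional) distortion risk measures yielding the risk mapping 
$\mu_2 = (\mu_2^i)_{i \in \Omega_1}$. Furthermore, using the representation in Theorem 2.2 for 
distortion risk measures yields 

$$\mu_1(X_1) = \max_{p \in \mathcal{Q}_1} p^T X_1, \quad \forall X_1 \in \mathcal{X}_1,$$

$$\mathcal{Q}_1 := \{ p \in \Delta^{|\Omega_1|} : p(S) \le c_1(S), \; \forall S \subseteq \Omega_1 \}, \tag{14}$$

and similarly, for any $i \in \Omega_1$, 

$$\mu_2^i(X_2) = \max_{q \in \mathcal{Q}_{2|i}} q^T X_2, \quad \forall X_2 \in \mathcal{X}_2,$$

$$\mathcal{Q}_{2|i} := \{ q \in \Delta^{|\Omega_2|} : q(U) \le c_{2|i}(U), \; \forall U \subseteq \mathcal{C}_i, \; q|_{\Omega_2 \setminus \mathcal{C}_i} = 0 \}. \tag{15}$$

In particular, recalling the interpretation at the end of Section 2.2, we have 
$\mathcal{Q}_1 \equiv \mathcal{B}_{c_1}$. Similarly, it can be seen that the projection of the polytope 
$\mathcal{Q}_{2|i}$ on the coordinates $\mathcal{C}_i$ is exactly given by $\mathcal{B}_{c_{2|i}}$, for 
any $i \in \Omega_1$. From these relations, we have that $\mu_C(Y) = \max_{q \in \mathcal{Q}} q^T Y$, 
where the set $\mathcal{Q}$ has the following product form structure (see Chapter 12 of Shapiro et 
al. [53] for similar ideas): 

$$\mathcal{Q} := \Big\{ q \in \Delta^{|\Omega_2|} : \exists p \in \mathcal{Q}_1, \; \exists q^i \in \mathcal{Q}_{2|i}, \; \forall i \in \Omega_1, \text{ such that } q = \sum_{i \in \Omega_1} p_i q^i \Big\}. \tag{16}$$

We now show that $\mathcal{Q} = \mathcal{Q}_C$, by double inclusion. 

"$\subseteq$". Consider any $q \in \mathcal{Q}$, and let $p \in \mathcal{Q}_1$ and 
$q^i \in \mathcal{Q}_{2|i}$ denote the corresponding vectors in the representation above. Since 
$q|_{\mathcal{C}_i} = p_i \cdot q^i|_{\mathcal{C}_i}$, $\forall i \in \Omega_1$, and 
$q^i \in \mathcal{Q}_{2|i}$, $\forall i \in \Omega_1$, we trivially have that $p$ and $q$ satisfy the 
conditions (13b), and hence $q \in \mathcal{Q}_C$. 

"$\supseteq$". Conversely, for any $q \in \mathcal{Q}_C$, and with $p$ satisfying the constraints (13b), 
it can be readily checked that $q^i := q|_{\mathcal{C}_i} / p_i$ (extended by zeros outside 
$\mathcal{C}_i$) belongs to $\mathcal{Q}_{2|i}$. The only constraint that is not trivial is 
$\mathbf{1}^T q^i = 1$, but this must be true, since, otherwise, we would have 
$\sum_{i \in \Omega_1} q(\mathcal{C}_i) = \sum_{i \in \Omega_1} p_i \, \mathbf{1}^T q^i < p(\Omega_1) = 1$, 
which would contradict $q \in \Delta^{|\Omega_2|}$. Therefore, 
$q = \sum_{i \in \Omega_1} p_i q^i \in \mathcal{Q}$. For completeness, we also note that 
$q^i \in \mathcal{Q}_{2|i}$ if and only if 
$q^i|_{\mathcal{C}_i} \in \mathcal{B}_{c_{2|i}} \Leftrightarrow q|_{\mathcal{C}_i} \in \mathcal{B}_{p_i \cdot c_{2|i}}$, 
for any $i \in \Omega_1$ (by part (iv) of Theorem 7.4 in Section 7.2 of the Appendix). 

The proof for the case $Y \ge 0$ is analogous to that in Proposition 3.1, and involves a trivial 
application of Theorem 7.1 in the Appendix. □ 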

As expected, the set of product measures $\mathcal{Q}_C$ involved in the representation of $\mu_C$ has 
a more complicated structure than $\mathcal{Q}_I$, which is given by a single base polytope. However, 
when discussing nonnegative losses, the down-monotone closure $\mathrm{sub}(\mathcal{Q}_C)$ plays 
exactly the same role as that of $\mathrm{sub}(\mathcal{Q}_I)$ in Proposition 3.1, in that using these 
down-monotone closures of the sets of measures does not alter the risk evaluation. Given the central 
role played by the latter two polytopes in the remainder of our analysis, we seek a convenient 
representation for their inequality descriptions. The following two lemmas provide the steps needed 
towards this end. 

Lemma 3.1. The downward monotone closure of the polytope $\mathcal{Q}_I$ is given by the polymatroid 
associated with the function $c$, 

$$\mathrm{sub}(\mathcal{Q}_I) \equiv \mathcal{P}_c := \{ q \in \mathbb{R}_+^{|\Omega_2|} : q(S) \le c(S), \; \forall S \subseteq \Omega_2 \}. \tag{17}$$

Proof. As argued in Proposition 3.1, $\mathcal{Q}_I \equiv \mathcal{B}_c$. Since, for any polymatroid 
rank function $c$, the downward monotone closure of the base polytope $\mathcal{B}_c$ is exactly given 
by the polymatroid $\mathcal{P}_c$ (see Theorem 7.4 in Section 7.2), the conclusion is immediate. □ 

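Lemma 3.2. The downward monotone closure of the polytope $\mathcal{Q}_C$ is given by 

$$\mathrm{sub}(\mathcal{Q}_C) = \{ q \in \mathbb{R}_+^{|\Omega_2|} : \exists p \in \mathcal{P}_{c_1} : q|_{\mathcal{C}_i} \in \mathcal{P}_{p_i \cdot c_{2|i}}, \; \forall i \in \Omega_1 \}$$

$$\equiv \left\{ q \in \mathbb{R}_+^{|\Omega_2|} : \exists p \in \mathbb{R}_+^{|\Omega_1|} : \begin{array}{l} p(S) \le c_1(S), \; \forall S \subseteq \Omega_1 \\ q(U) \le p_i \cdot c_{2|i}(U), \; \forall U \subseteq \mathcal{C}_i, \; \forall i \in \Omega_1 \end{array} \right\}. \tag{18}$$

Proof. The fact that the two sets on the right are identical is immediate from the definition of the 
polymatroid associated with a rank function $c$ (see Section 7.2). As such, denote by $\mathcal{A}$ the 
set on the right of (18). 

"$\subseteq$". Consider an arbitrary $x \in \mathrm{sub}(\mathcal{Q}_C)$. By definition, $x \ge 0$ and 
$\exists q \in \mathcal{Q}_C$ such that $q \ge x$. Let $p$ correspond to $q$ in the 
representation (13b) for $\mathcal{Q}_C$. To argue that $x \in \mathcal{A}$, we show that the pair 
$(p, x)$ satisfies all the constraints in (18). To this end, since $p \in \mathcal{B}_{c_1}$ (and 
$\mathcal{B}_{c_1} \subseteq \mathbb{R}_+^{|\Omega_1|}$), we immediately have $p \in \mathcal{P}_{c_1}$. 
Furthermore, $\forall i \in \Omega_1$ and $\forall U \subseteq \mathcal{C}_i$, we have 
$x(U) \le q(U) \le p_i \cdot c_{2|i}(U)$, which proves that $x \in \mathcal{A}$. 

"$\supseteq$". Consider an arbitrary $q \in \mathcal{A}$, and let $p$ be such that the pair $(p, q)$ 
satisfies all constraints in (18). Since $p \in \mathcal{P}_{c_1} = \mathrm{sub}(\mathcal{B}_{c_1})$, 
$\exists \bar{p} \in \mathcal{B}_{c_1}$ such that $\bar{p} \ge p \ge 0$. The pair $(\bar{p}, q)$ also 
satisfies all the constraints (18). In particular, 
$q|_{\mathcal{C}_i} \in \mathcal{P}_{\bar{p}_i \cdot c_{2|i}} = \mathrm{sub}(\mathcal{B}_{\bar{p}_i \cdot c_{2|i}})$, 
for any $i \in \Omega_1$. Therefore, $\exists \bar{q} \in \mathbb{R}_+^{|\Omega_2|}$ such that 
$\bar{q}|_{\mathcal{C}_i} \in \mathcal{B}_{\bar{p}_i \cdot c_{2|i}}$ and 
$\bar{q}|_{\mathcal{C}_i} \ge q|_{\mathcal{C}_i} \ge 0$, for any $i \in \Omega_1$. It can be readily 
checked that, by construction, the pair $(\bar{p}, \bar{q})$ satisfies all the constraints (13b) 
defining $\mathcal{Q}_C$. Therefore, with $\bar{q} \in \mathcal{Q}_C$ and $\bar{q} \ge q \ge 0$, we must 
have $q \in \mathrm{sub}(\mathcal{Q}_C)$. □ 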

3.1 Conditions for $\mu_C(Y) \le \mu_I(Y)$ 

With the representations provided by Propositions 3.1 and 3.2, the consistent and inconsistent risk 
measures can be directly related by making reference to the underlying sets of probability measures (and 
their respective downward monotone closures), as follows. 

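Lemma 3.3. The following four statements are equivalent. 

(i) $\mu_C(Y) \le \mu_I(Y)$, $\forall Y \in \mathcal{X}_2$. 

(ii) $\mathcal{Q}_C \subseteq \mathcal{Q}_I$. 

(iii) $\mu_C(Y) \le \mu_I(Y)$, $\forall Y \in \mathcal{X}_2$, $Y \ge 0$. 

(iv) $\mathrm{sub}(\mathcal{Q}_C) \subseteq \mathrm{sub}(\mathcal{Q}_I)$. 

Proof. "(i) $\Leftrightarrow$ (ii)". Note that $\mu_I$ and $\mu_C$ can be identified as the support 
functions of the sets $\mathcal{Q}_I$ and $\mathcal{Q}_C$, respectively. Since the latter sets are 
closed and convex (in fact, polyhedral subsets of the simplex $\Delta^{|\Omega_2|}$), their support 
functions satisfy the desired inequality if and only if the sets satisfy the required inclusion 
relation (see Corollary 13.1.1 in Rockafellar [40]). 

"(iii) $\Leftrightarrow$ (iv)". Recall that Proposition 3.1 and Proposition 3.2 yield the 
representations $\mu_I(Y) = \max_{q \in \mathrm{sub}(\mathcal{Q}_I)} q^T Y$ and 
$\mu_C(Y) = \max_{q \in \mathrm{sub}(\mathcal{Q}_C)} q^T Y$, respectively. It can then be immediately 
seen that $\mathrm{sub}(\mathcal{Q}_C) \subseteq \mathrm{sub}(\mathcal{Q}_I) \Rightarrow \mu_C(Y) \le \mu_I(Y)$, 
$\forall Y \ge 0$. For the reverse direction, assume (by contradiction) that there exists 
$\bar{q} \in \mathrm{sub}(\mathcal{Q}_C) \setminus \mathrm{sub}(\mathcal{Q}_I)$. Since 
$\mathrm{sub}(\mathcal{Q}_I)$ is a down-monotone polytope, by Proposition 7.1 in the Appendix, it admits 
a representation of the form 
$\mathrm{sub}(\mathcal{Q}_I) = \{ q \in \mathbb{R}_+^{|\Omega_2|} : a_k^T q \le b_k, \; \forall k \in \mathcal{K} \}$, 
where $\mathcal{K}$ is some finite index set, and $a_k \ge 0$, $b_k \ge 0$, $\forall k \in \mathcal{K}$. 
Since $\bar{q} \notin \mathrm{sub}(\mathcal{Q}_I)$, there exists $k \in \mathcal{K}$ such that 
$\bar{q}^T a_k > b_k \ge q^T a_k$, $\forall q \in \mathrm{sub}(\mathcal{Q}_I)$. In particular, we obtain 
the contradiction $\mu_C(a_k) \ge \bar{q}^T a_k > \mu_I(a_k)$. 

"(i) $\Leftrightarrow$ (iii)". Consider an arbitrary $Y \in \mathcal{X}_2$. Since $\mathcal{X}_2$ is 
isomorphic with $\mathbb{R}^{|\Omega_2|}$, there exists $y \in \mathbb{R}$ such that 
$Y \ge y \, \mathbf{1}_{\Omega_2}$. Then, by the translation-invariance property ([P2]) of the risk 
measures, 

$$\mu_C(Y) = y + \mu_C(\bar{Y}) \quad \text{and} \quad \mu_I(Y) = y + \mu_I(\bar{Y}),$$

where $\bar{Y} := Y - y \, \mathbf{1}_{\Omega_2} \ge 0$. The equivalence is then immediate. □ 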



The lemma suggests that the question $\mu_C(Y) \le \mu_I(Y)$, $\forall Y \in \mathcal{X}_2$, can be 
equivalently addressed by examining the inclusion of $\mathcal{Q}_C$ and $\mathcal{Q}_I$ or their 
corresponding down-monotone closures (for an example, see Figure 3). Furthermore, the restriction 
$Y \ge 0$ is without loss of generality when examining this inequality. 

Figure 3: Inclusion relation between $\mathcal{Q}_I$, $\mathcal{Q}_C$ (and the corresponding downward 
closures, $\mathrm{sub}(\mathcal{Q}_I)$ and $\mathrm{sub}(\mathcal{Q}_C)$, respectively) that is 
equivalent to $\mu_C(Y) \le \mu_I(Y)$, $\forall Y \in \mathcal{X}_2$. 

The previous lemma prompts the natural question of what conditions on the risk measures $\mu_C$ and 
$\mu_I$ would ensure that $\mathcal{Q}_C \subseteq \mathcal{Q}_I$ or 
$\mathrm{sub}(\mathcal{Q}_C) \subseteq \mathrm{sub}(\mathcal{Q}_I)$. The following theorem provides a 
set of such relations, stated (equivalently) in terms of the Choquet capacities $c$, $c_1$ and 
$c_{2|i}$ defining the relevant sets of measures. 

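Theorem 3.1. $\mathcal{Q}_C \subseteq \mathcal{Q}_I$ if and only if the Choquet capacities $c$, $c_1$ 
and $c_{2|i}$ appearing in (11b) and (13b) satisfy the inequalities 

$$\sum_{i=1}^{|\Omega_1|} \Big[ c_1\big(\cup_{k=1}^{i} s_k\big) - c_1\big(\cup_{k=1}^{i-1} s_k\big) \Big] \cdot c_{2|s_i}(U_{s_i}) \le c\big(\cup_{i \in \Omega_1} U_i\big), \tag{19}$$

where $(s_1, \dots, s_{|\Omega_1|})$ denotes any permutation of the elements of $\Omega_1$, and 
$U_i \subseteq \mathcal{C}_i$ for any $i \in \Omega_1$. 

Theorem 3.2 (Alternative Statement). $\mathcal{Q}_C \subseteq \mathcal{Q}_I$ if and only if the Choquet 
capacities $c$, $c_1$ and $c_{2|i}$ appearing in (11b) and (13b) satisfy the inequalities 

$$\sum_{i=1}^{|\Omega_1|} \Big[ c_1\big(\cup_{k=1}^{i} \sigma(k)\big) - c_1\big(\cup_{k=1}^{i-1} \sigma(k)\big) \Big] \cdot c_{2|\sigma(i)}(U_{\sigma(i)}) \le c\big(\cup_{i \in \Omega_1} U_i\big), \quad \forall \sigma \in \Pi(\Omega_1), \; \forall U_i \subseteq \mathcal{C}_i, \; \forall i \in \Omega_1.$$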



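Proof. "$\Leftarrow$". Consider an arbitrary $q \in \mathcal{Q}_C$ and let $p$ be a corresponding 
vector in the representation (13b). Since $p \in \mathcal{B}_{c_1}$, it can always be expressed as a 
convex combination of the extreme points of the latter polytope. In particular, there exist convex 
weights $\{\lambda_\sigma\}_{\sigma \in \Pi(\Omega_1)}$ such that 
$p = \sum_{\sigma \in \Pi(\Omega_1)} \lambda_\sigma p^\sigma$, where 

$$p^\sigma_{\sigma(i)} := c_1\big(\cup_{k=1}^{i} \sigma(k)\big) - c_1\big(\cup_{k=1}^{i-1} \sigma(k)\big), \quad i \in \{1, \dots, |\Omega_1|\},$$

denote the extreme points of $\mathcal{B}_{c_1}$ (see Theorem 7.6 in Section 7.2). To see that 
$q \in \mathcal{Q}_I$, consider an arbitrary $V \subseteq \Omega_2$, and note that it can always be 
written as $V = \cup_{i \in \Omega_1} U_i$ for some $U_i \subseteq \mathcal{C}_i$, 
$\forall i \in \Omega_1$. We then have that 

$$q(V) = \sum_{i \in \Omega_1} q(U_i) \overset{(13b)}{\le} \sum_{i \in \Omega_1} p_i \cdot c_{2|i}(U_i) = \sum_{\sigma \in \Pi(\Omega_1)} \lambda_\sigma \Big[ \sum_{i=1}^{|\Omega_1|} p^\sigma_{\sigma(i)} \cdot c_{2|\sigma(i)}(U_{\sigma(i)}) \Big] \overset{(19)}{\le} c(V).$$

Since this is true for an arbitrary $q \in \mathcal{Q}_C$, we must have 
$\mathcal{Q}_C \subseteq \mathcal{Q}_I$. 

"$\Rightarrow$". Assume $\exists \sigma \in \Pi(\Omega_1)$, and $U_i \subseteq \mathcal{C}_i$ that 
yield a strict reverse inequality in (19). To reduce the notational burden, assume (without loss of 
generality) that $\sigma$ is the identity permutation, i.e., $\sigma(i) = i$, 
$\forall i \in \Omega_1$. To derive a contradiction, we construct a 
$q \in \mathcal{Q}_C \setminus \mathcal{Q}_I$. To this end, consider the extreme point 
$p^\sigma \in \mathcal{B}_{c_1}$ with components 
$p^\sigma_i = c_1(\cup_{k=1}^{i} k) - c_1(\cup_{k=1}^{i-1} k)$. Furthermore, consider the vector 
$q \in \mathbb{R}_+^{|\Omega_2|}$ such that, for any $i \in \Omega_1$, its projection on the coordinates 
$\mathcal{C}_i$ satisfies $q|_{\mathcal{C}_i} \in \mathcal{B}_{p^\sigma_i \cdot c_{2|i}}$ and 
$q|_{\mathcal{C}_i}(U_i) = p^\sigma_i \cdot c_{2|i}(U_i)$ (such vectors always exist, and can be 
obtained just as $p^\sigma$ above). It can then be seen that $(q, p^\sigma)$ satisfies the 
constraints (13b), hence $q \in \mathcal{Q}_C$. However, $q \notin \mathcal{Q}_I$, since 

$$q\big(\cup_{i \in \Omega_1} U_i\big) = \sum_{i=1}^{|\Omega_1|} q(U_i) = \sum_{i=1}^{|\Omega_1|} \Big[ c_1\big(\cup_{k=1}^{i} k\big) - c_1\big(\cup_{k=1}^{i-1} k\big) \Big] \cdot c_{2|i}(U_i) > c\big(\cup_{i \in \Omega_1} U_i\big). \qquad \square$$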



Unfortunately, without additional assumptions on the Choquet capacities $c$, $c_1$, $c_{2|i}$, the 
inequalities in (19) yield a system of $O(|\Omega_1|! \cdot 2^{|\Omega_2|})$ conditions that must be 
tested. In some cases, however, these take a particularly simple form that immediately lends itself to 
an interpretation. Section 4 discusses one such example, when all risk measures are given by Average 
Value at Risk. 
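
For very small scenario trees, the system (19) can simply be enumerated. The following Python sketch does this by brute force, looping over all permutations of the first-stage nodes and all choices of subsets $U_i$. It is only an illustration under our own assumptions: capacities are passed as callables on `frozenset`s, the tree is encoded by a hypothetical dictionary `children`, and the example capacities are of the AVaR type $\min(1, |S|/k)$.

```python
from itertools import combinations, permutations, product

def avar_capacity(k):
    """Illustrative capacity S -> min(1, |S| / k)."""
    return lambda S: min(1.0, len(S) / k)

def subsets(items):
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def inequalities_19_hold(c, c1, c2, children, tol=1e-12):
    """Brute-force test of (19).

    children: dict mapping each first-stage node to the list of its leaves.
    c, c1: capacities on sets of leaves / sets of first-stage nodes.
    c2: dict mapping node i to the conditional capacity on subsets of children[i].
    """
    nodes = list(children)
    for sigma in permutations(nodes):
        # Marginal gains of c1 along the permutation sigma.
        gains, prev = {}, 0.0
        for pos, s in enumerate(sigma):
            cur = c1(frozenset(sigma[:pos + 1]))
            gains[s] = cur - prev
            prev = cur
        # All choices of one subset U_i of children[i] per node i.
        for choice in product(*(list(subsets(children[i])) for i in nodes)):
            lhs = sum(gains[i] * c2[i](U) for i, U in zip(nodes, choice))
            if lhs > c(frozenset().union(*choice)) + tol:
                return False
    return True

# Two first-stage nodes with two leaves each (N = 2), levels eps1 = eps2 = 0.75.
children = {0: [0, 1], 1: [2, 3]}
c1 = avar_capacity(2 * 0.75)
c2 = {i: avar_capacity(2 * 0.75) for i in children}
print(inequalities_19_hold(avar_capacity(4 * 0.5), c1, c2, children))  # eps = 0.5
print(inequalities_19_hold(avar_capacity(4 * 0.6), c1, c2, children))  # eps = 0.6
```

The enumeration grows factorially in $|\Omega_1|$ and exponentially in $|\Omega_2|$, so the sketch is only meant for sanity checks on toy instances.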



3.2 Finding the Optimal $\alpha^*$ 

Lemma 3.3 and Theorem 3.1 provide a complete set of conditions ensuring that 
$\mu_C(Y) \le \mu_I(Y)$, $\forall Y \in \mathcal{X}_2$ ($Y \ge 0$). The former result suggests that this 
question is, in fact, equivalent to $\mathcal{Q}_C \subseteq \mathcal{Q}_I$, and also equivalent to 
$\mathrm{sub}(\mathcal{Q}_C) \subseteq \mathrm{sub}(\mathcal{Q}_I)$. For any such pair of risk measures, 
we can now address the second question in Problem 1, namely characterizing the smallest factor 
$\alpha$ such that 

$$\mu_I(Y) \le \alpha \cdot \mu_C(Y), \quad \forall Y \in \mathcal{X}_2, \; Y \ge 0.$$

Recall that, in order to even find a feasible $\alpha$ for the problem above, we must operate under 
Assumption 2.1, namely that $Y \ge 0$. In view of this observation and the result in Lemma 3.3, finding 
a feasible (minimal) $\alpha$ in Problem 1 is akin to finding a feasible (minimal) scaling of the 
polytope $\mathrm{sub}(\mathcal{Q}_C)$ that would contain the polytope $\mathrm{sub}(\mathcal{Q}_I)$. 
The following lemma formalizes this characterization. 

Lemma 3.4. If $Y \ge 0$, then the minimal scaling $\alpha^*$ in Problem 1 is given by 

$$\alpha^* = \sup_{Y \ge 0} \frac{\max_{q \in \mathrm{sub}(\mathcal{Q}_I)} q^T Y}{\max_{q \in \mathrm{sub}(\mathcal{Q}_C)} q^T Y} = \inf \{ \alpha > 0 : \mathrm{sub}(\mathcal{Q}_I) \subseteq \alpha \cdot \mathrm{sub}(\mathcal{Q}_C) \}. \tag{20}$$

Proof. First note that any feasible $\alpha$ must satisfy 

$$\max_{q \in \mathcal{Q}_I} q^T Y \le \alpha \cdot \max_{q \in \mathcal{Q}_C} q^T Y, \quad \forall Y \ge 0$$

$$\Leftrightarrow \quad \alpha \ge \frac{\max_{q \in \mathcal{Q}_I} q^T Y}{\max_{q \in \mathcal{Q}_C} q^T Y}, \quad \forall Y \ge 0 \quad (\text{since } Y \ge 0)$$

$$\Rightarrow \quad \alpha^* = \sup_{Y \ge 0} \frac{\max_{q \in \mathcal{Q}_I} q^T Y}{\max_{q \in \mathcal{Q}_C} q^T Y}.$$

Furthermore, it can be argued (see Theorem 7.1 in the Appendix) that 

$$\max_{q \in \mathcal{Q}_I} q^T Y = \max_{q \in \mathrm{sub}(\mathcal{Q}_I)} q^T Y, \quad \forall Y \ge 0,$$

and similarly for $\mathcal{Q}_C$ and $\mathrm{sub}(\mathcal{Q}_C)$, which completes the first claim 
in (20). 

In this setup, our problem bears a strong connection with the question of determining the strength 
of valid relaxations in integer programming problems. More precisely, it is known (see Theorem 7.2 in 
the Appendix and Goemans and Hall [28]) that, if $P$ and $Q$ are full-dimensional downward monotone 
polytopes in $\mathbb{R}^n_+$ with $Q \subseteq P$, then $P \subseteq \alpha Q$ if and only if, for any 
$w \in \mathbb{R}^n_+$, $\max\{ w^T x : x \in Q \} \ge \frac{1}{\alpha} \max\{ w^T x : x \in P \}$. 
Moreover, the minimal such $\alpha$ is given by 
$\sup_{w \in \mathbb{R}^n_+} \frac{\max\{ w^T x : x \in P \}}{\max\{ w^T x : x \in Q \}}$. 

Here, $\mathrm{sub}(\mathcal{Q}_C) \subseteq \mathrm{sub}(\mathcal{Q}_I)$, and both are down-monotone 
polytopes in $\mathbb{R}_+^{|\Omega_2|}$. Therefore, by applying the results in Theorem 7.2, we 
immediately recover the second part of (20). □ 

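With these results in place, we can now provide the following concise characterization for $\alpha^*$. 

Theorem 3.3. The smallest value of $\alpha$ such that $\mu_I(Y) \le \alpha \cdot \mu_C(Y)$, 
$\forall Y \ge 0$, is given by 

$$\alpha^* = \max_{q \in \mathrm{ext}(\mathcal{Q}_I)} \max_{S \subseteq \Omega_1} \frac{\sum_{i \in S} \max_{U \subseteq \mathcal{C}_i} \frac{q(U)}{c_{2|i}(U)}}{c_1(S)}. \tag{21}$$

Proof. Consider an arbitrary $q \in \mathrm{sub}(\mathcal{Q}_I)$. By (20), we have that any feasible 
scaling $\alpha$ must satisfy 

$$\frac{1}{\alpha} q \in \mathrm{sub}(\mathcal{Q}_C) \overset{(18)}{\Leftrightarrow} \exists p \in \mathbb{R}_+^{|\Omega_1|} : \begin{cases} p(S) \le c_1(S), & \forall S \subseteq \Omega_1 \\ \frac{1}{\alpha} q(U) \le p_i \cdot c_{2|i}(U), & \forall U \subseteq \mathcal{C}_i, \; \forall i \in \Omega_1. \end{cases}$$

The second set of constraints implies that any feasible $p$ above satisfies 
$p_i \ge \frac{1}{\alpha} \max_{U \subseteq \mathcal{C}_i} \frac{q(U)}{c_{2|i}(U)}$, 
$\forall i \in \Omega_1$. Corroborated with the first set of constraints, this yields 

$$\frac{1}{\alpha} \sum_{i \in S} \max_{U \subseteq \mathcal{C}_i} \frac{q(U)}{c_{2|i}(U)} \le \sum_{i \in S} p_i \le c_1(S), \; \forall S \subseteq \Omega_1 \quad \Leftrightarrow \quad \alpha \ge \max_{S \subseteq \Omega_1} \frac{\sum_{i \in S} \max_{U \subseteq \mathcal{C}_i} \frac{q(U)}{c_{2|i}(U)}}{c_1(S)}.$$

Therefore, the smallest possible $\alpha$ is given by 
$\alpha^* = \max_{q \in \mathrm{sub}(\mathcal{Q}_I)} \max_{S \subseteq \Omega_1} \frac{\sum_{i \in S} \max_{U \subseteq \mathcal{C}_i} q(U) / c_{2|i}(U)}{c_1(S)}$. 
The result (21) is then immediate, by recognizing that the function maximized over $q$ is nondecreasing 
in the components of $q$ (so that $\mathrm{sub}(\mathcal{Q}_I)$ can be replaced with $\mathcal{Q}_I$), 
and it is also convex in $q$ (hence reaching its maximum at the extreme points). □ 

From the above representations, it can be immediately seen that the optimal $\alpha^*$ is at least 1, 
and can be $+\infty$ whenever the dimension of the polytope $\mathcal{Q}_C$ is strictly smaller than 
that of the polytope $\mathcal{Q}_I$ (also see the discussion at the end of Section 7.1). This can be 
avoided by making the following assumption concerning the consistent risk measure $\mu_C$. 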

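Assumption 3.1 (Relevance). The Choquet capacities $c_1$, $c_{2|i}$ appearing in the 
representation (13b) for the set $\mathcal{Q}_C$ satisfy the properties 

$$c_1(\{i\}) > 0, \quad \forall i \in \Omega_1,$$
$$c_{2|i}(\{j\}) > 0, \quad \forall i \in \Omega_1, \; \forall j \in \mathcal{C}_i.$$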

Assumption 3.1 ensures that the consistent risk measure considers all possible outcomes in the scenario 
tree (otherwise, certain elementary outcomes would never materialize, since all product measures would 
assign zero probability to them). From an axiomatic perspective, it is in line with the original 
requirement of relevance in Artzner et al. [6], which states that, for any random cost $Y$ such that 
$Y \ge 0$ and $Y \ne 0$, any risk measure $\mu$ should satisfy $\mu(Y) > 0$. 

As a consequence, the polytope $\mathrm{sub}(\mathcal{Q}_C)$ is full-dimensional in 
$\mathbb{R}^{|\Omega_2|}$ (this can be readily checked by exhibiting a point in its strict interior, 
built from the strictly positive capacities of Assumption 3.1), which implies that $\alpha^*$ is 
finite. 
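
On small instances, formula (21) can be evaluated directly. The sketch below is a hypothetical brute-force illustration: it generates the extreme points of $\mathcal{Q}_I$ by the greedy rule over all orderings of the leaves, and then maximizes the ratio in (21) over all nonempty subsets. The encoding of the tree (`children`), the helper names, and the AVaR-type capacities used in the example are assumptions made for illustration, and Assumption 3.1 is needed so that no denominator vanishes.

```python
from itertools import combinations, permutations

def avar_capacity(k):
    """Illustrative capacity S -> min(1, |S| / k)."""
    return lambda S: min(1.0, len(S) / k)

def nonempty_subsets(items):
    items = list(items)
    for r in range(1, len(items) + 1):
        yield from combinations(items, r)

def price_of_dynamic_consistency(c, c1, c2, children):
    """Brute-force evaluation of formula (21) on a two-period tree."""
    leaves = [j for i in children for j in children[i]]
    best = 0.0
    for sigma in permutations(leaves):
        # Greedy extreme point of the base polytope of c along sigma.
        q, prev = {}, 0.0
        for pos, j in enumerate(sigma):
            cur = c(frozenset(sigma[:pos + 1]))
            q[j] = cur - prev
            prev = cur
        # Inner maximization over U within the children of each node i.
        inner = {
            i: max(sum(q[j] for j in U) / c2[i](frozenset(U))
                   for U in nonempty_subsets(children[i]))
            for i in children
        }
        # Outer maximization over nonempty sets S of first-stage nodes.
        for S in nonempty_subsets(children):
            best = max(best, sum(inner[i] for i in S) / c1(frozenset(S)))
    return best

# N = 2, static level eps = 0.5, conditional levels eps1 = eps2 = 0.75.
children = {0: [0, 1], 1: [2, 3]}
c = avar_capacity(4 * 0.5)
c1 = avar_capacity(2 * 0.75)
c2 = {i: avar_capacity(2 * 0.75) for i in children}
print(price_of_dynamic_consistency(c, c1, c2, children))
```

For this instance the computed factor equals 1.5 up to floating-point rounding; it is attained, for example, by a loss equal to 1 on both leaves of one first-stage node and 0 elsewhere.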



4 Exact Analytical Values and Bounds for AVaR 

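In this section, we discuss a special case of risk measures, in which an exact analytical expression is 
attainable for $\alpha^*$. We consider a two-period scenario tree, where $|\Omega_1| = N$, and 
$|\mathcal{C}_i| = N$, $\forall i \in \Omega_1$. In order to simplify the discussion, and without loss 
of generality, we denote the nodes in $\Omega_1$ by $\{1, \dots, N\}$. The reference probability 
distribution on the scenario tree is the uniform measure. The inconsistent measure $\mu_I$ is given by 
AVaR at level $\epsilon \in [0, 1]$, and the conditional risk measures appearing in the representation 
for $\mu_C$ are also given by AVaR. More precisely, $\mu_1$ in (14) is given by 
$\mathrm{AVaR}_{\epsilon_1}$, and $\mu_2^i$ in (15) is equal to $\mathrm{AVaR}_{\epsilon_2}$, 
$\forall i \in \Omega_1$, where $\epsilon_1, \epsilon_2 \in [0, 1]$. 

As discussed in Section 2.2, when $\mathbb{P}$ is uniform and the risk measure under consideration is 
AVaR, we have $c(S) = \min\big(1, \frac{|S|}{N^2 \epsilon}\big)$, and the following simplified 
expressions result for the sets of measures representing $\mu_I$ and $\mu_C$: 

$$\mathcal{Q}_I = \Big\{ q \in \Delta^{N^2} : q(\{j\}) \le \frac{1}{N^2 \epsilon}, \; \forall j \in \Omega_2 \Big\}, \tag{22a}$$

$$\mathcal{Q}_C = \left\{ q \in \Delta^{N^2} : \exists p \in \Delta^{N} : \begin{array}{l} p_i \le \frac{1}{N \epsilon_1}, \; q(\mathcal{C}_i) \le p_i, \; \forall i \in \Omega_1 \\ q(\{j\}) \le p_i \cdot \frac{1}{N \epsilon_2}, \; \forall j \in \mathcal{C}_i, \; \forall i \in \Omega_1 \end{array} \right\}. \tag{22b}$$

It can be readily seen from (22a) that, if $\epsilon \le \frac{1}{N^2}$, the set $\mathcal{Q}_I$ is 
identical to the entire simplex $\Delta^{N^2}$, rendering the case $\epsilon < \frac{1}{N^2}$ 
effectively analogous to $\epsilon = \frac{1}{N^2}$. Therefore, without loss of generality, we restrict 
attention to the case $\epsilon \ge \frac{1}{N^2}$. A similar reasoning for $\mathcal{Q}_C$ allows 
restricting attention to $\{\epsilon_1, \epsilon_2\} \ge \frac{1}{N}$. The above equations lead to the 
following compact version of Theorem 3.1. 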

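Theorem 4.1 (Theorem 3.1 for AVaR). If $\epsilon \le \epsilon_1 \epsilon_2$, then 
$\mu_C(Y) \le \mu_I(Y)$, $\forall Y \in \mathcal{X}_2$, $Y \ge 0$. Furthermore, if 
$\epsilon_t \ge \frac{1}{N}$, $\forall t \in \{1, 2\}$, then the reverse implication also holds. 

Proof. As per Lemma 3.3, we have that $\mu_C(Y) \le \mu_I(Y)$, $\forall Y \in \mathcal{X}_2$, 
$Y \ge 0$, is equivalent to $\mathcal{Q}_C \subseteq \mathcal{Q}_I$. To see the first claim, consider an 
arbitrary $q \in \mathcal{Q}_C$, and let $p$ be such that $(p, q)$ satisfy the requirements in (22b). 
Then $q(\{j\}) \le \frac{p_i}{N \epsilon_2} \le \frac{1}{N^2 \epsilon_1 \epsilon_2} \le \frac{1}{N^2 \epsilon}$, 
hence $q \in \mathcal{Q}_I$. 

The second claim is proved using the contrapositive. If $\epsilon_t \ge \frac{1}{N}$ and 
$\epsilon > \epsilon_1 \epsilon_2$, we can show the existence of a vector $q$ in 
$\mathcal{Q}_C \setminus \mathcal{Q}_I$. Using the alternative description for the set $\mathcal{Q}_C$ 
in (16), consider a $p \in \mathcal{Q}_1$ that has $\lfloor N \epsilon_1 \rfloor$ components with a 
value of $\frac{1}{N \epsilon_1}$, an additional component with a value of 
$1 - \frac{\lfloor N \epsilon_1 \rfloor}{N \epsilon_1}$, and zeros as the remaining components. Further, 
consider the following set of vectors, $q^i \in \mathcal{Q}_{2|i}$, $\forall i \in \Omega_1$, such that 
each $q^i|_{\mathcal{C}_i}$ has $\lfloor N \epsilon_2 \rfloor$ components with a value of 
$\frac{1}{N \epsilon_2}$, an additional component with a value of 
$1 - \frac{\lfloor N \epsilon_2 \rfloor}{N \epsilon_2}$, and zeros in the remaining components. It can 
be seen that the corresponding $q = \sum_{i \in \Omega_1} p_i q^i$ belongs to $\mathcal{Q}_C$, but 
$q \notin \mathcal{Q}_I$, since it contains components with value 
$\frac{1}{N^2 \epsilon_1 \epsilon_2}$, and $\epsilon > \epsilon_1 \epsilon_2$. □ 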

Before characterizing the price of dynamic consistency $\alpha^*$, we first establish an interesting 
property enabled by the simplifying assumptions in the present section. 

Lemma 4.1. Consider a discrete set $\Omega$ and a non-negative vector $f \in \mathbb{R}^{|\Omega|}$. 
Let $k$ be a non-negative parameter such that $k \le |\Omega|$. Then, the following ratio-maximization 
problem is resolved as 

$$\max_{S \subseteq \Omega} \frac{\sum_{s \in S} f_s}{\min\big(1, \frac{|S|}{k}\big)} = \max\Big\{ k \cdot \max_{s \in \Omega} \{ f_s \}, \; \sum_{s \in \Omega} f_s \Big\}. \tag{23}$$



Proof. Consider the permutation of the elements of $\Omega$, say, $(s_1, s_2, \dots, s_{|\Omega|})$, 
such that $f_{s_1} \ge f_{s_2} \ge \cdots \ge f_{s_{|\Omega|}}$. We first claim that, for any 
$m \in \{1, \dots, |\Omega|\}$: 

$$\max_{S \subseteq \Omega : |S| = m} \frac{\sum_{s \in S} f_s}{\min\big(1, \frac{|S|}{k}\big)} = \frac{\sum_{j=1}^{m} f_{s_j}}{\min\big(1, \frac{m}{k}\big)}. \tag{24}$$

The denominator in the objective function depends only on the cardinality of the set $S$, and is 
therefore a constant under the constraint $|S| = m$. The numerator is maximized by picking the highest 
$m$ elements from the set $\Omega$. Consequently, it suffices to consider (24) for each 
$m \in \{1, 2, \dots, |\Omega|\}$, and pick the subset $\{s_1, \dots, s_m\}$ that yields the highest 
ratio. Note that for $m \ge \lceil k \rceil$, the denominator is unity, while for 
$m \le \lfloor k \rfloor$, the denominator takes a value of $\frac{m}{k}$. Due to the mediant 
inequality$^1$, we also have 

$$f_{s_1} \ge \frac{f_{s_1} + f_{s_2}}{2} \ge \cdots \ge \frac{f_{s_1} + \cdots + f_{s_{\lfloor k \rfloor}}}{\lfloor k \rfloor}.$$

Since $f \ge 0$ and $f_{s_1} = \max_{s \in \Omega} \{ f_s \}$, this immediately leads to the above 
claim in (23). □ 
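
The closed form (23) is easy to sanity-check numerically. The short Python sketch below compares it against a brute-force enumeration over all nonempty subsets; the function names and the test vector are hypothetical and serve only as an illustration.

```python
from itertools import combinations

def ratio_max_bruteforce(f, k):
    """Left-hand side of (23): enumerate all nonempty subsets S."""
    n = len(f)
    best = 0.0
    for m in range(1, n + 1):
        for S in combinations(range(n), m):
            best = max(best, sum(f[s] for s in S) / min(1.0, m / k))
    return best

def ratio_max_closed_form(f, k):
    """Right-hand side of (23)."""
    return max(k * max(f), sum(f))

f, k = [3.0, 1.0, 1.0], 2.5
print(ratio_max_bruteforce(f, k), ratio_max_closed_form(f, k))
```

Both evaluations agree (up to rounding) at 7.5 for this input: the singleton containing the largest entry dominates, because $k \cdot \max_s f_s = 7.5$ exceeds the total sum of 5.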



4.1 Analytical Expression for $\alpha^*$ 

In this section, we characterize the price of dynamic consistency $\alpha^*$ for the special case of a 
two-stage AVaR. The following result provides a first step towards this end. 

Theorem 4.2. The optimal scaling $\alpha^*$ for the two-period case of AVaR is given as 

$$\alpha^* = \max_{q \in \mathrm{ext}(\mathcal{Q}_I)} \max\Big\{ N \epsilon_1 \cdot \max_{i \in \Omega_1} \{ V_i(q) \}, \; \sum_{i \in \Omega_1} V_i(q) \Big\}, \tag{25}$$

where $V_i(q) := \max\{ q(\mathcal{C}_i), \; N \epsilon_2 \cdot \max(q|_{\mathcal{C}_i}) \}$. 

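Proof. For the two-period case of AVaR, Theorem 3.3 yields the following expression for $\alpha^*$: 

$$\alpha^* = \max_{q \in \mathrm{ext}(\mathcal{Q}_I)} \max_{S \subseteq \Omega_1} \frac{\sum_{i \in S} \max_{U \subseteq \mathcal{C}_i} \frac{q(U)}{\min(1, \frac{|U|}{N \epsilon_2})}}{\min\big(1, \frac{|S|}{N \epsilon_1}\big)}. \tag{26}$$

Consider a fixed $q \in \mathrm{ext}(\mathcal{Q}_I)$, a fixed $S \subseteq \Omega_1$ and a fixed 
$i \in S$. The corresponding inner ratio maximization problem may be resolved using Lemma 4.1 as 

$$\max_{U \subseteq \mathcal{C}_i} \frac{q(U)}{\min\big(1, \frac{|U|}{N \epsilon_2}\big)} = \max\{ q(\mathcal{C}_i), \; N \epsilon_2 \cdot \max(q|_{\mathcal{C}_i}) \} \; (= V_i(q)).$$

Similarly, for any fixed $q \in \mathrm{ext}(\mathcal{Q}_I)$, applying Lemma 4.1 yields 

$$\max_{S \subseteq \Omega_1} \frac{\sum_{i \in S} V_i(q)}{\min\big(1, \frac{|S|}{N \epsilon_1}\big)} = \max\Big\{ N \epsilon_1 \cdot \max_{i \in \Omega_1} \{ V_i(q) \}, \; \sum_{i \in \Omega_1} V_i(q) \Big\},$$

which immediately leads to (25). □ 

$^1$ If $b, d > 0$, then $\frac{a}{b} \ge \frac{c}{d} \Rightarrow \frac{a}{b} \ge \frac{a + c}{b + d} \ge \frac{c}{d}$. 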

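We next characterize the structure of the extreme points of the set $\mathcal{Q}_I$ in (22a). Recall 
from Proposition 3.1 that $\mathcal{Q}_I$ is identical to $\mathcal{B}_c$, i.e., it is the base 
polytope corresponding to the Choquet capacity $c$. Therefore, by Theorem 7.6, the extreme points of 
$\mathcal{Q}_I$ are given by 

$$x^\sigma_{\sigma(i)} = c(\{\sigma(1), \dots, \sigma(i)\}) - c(\{\sigma(1), \dots, \sigma(i-1)\}), \quad i \in \{1, \dots, N^2\},$$

where $\sigma$ is any permutation of the elements of $\Omega_2$. Since 
$c(S) = \min\big(1, \frac{|S|}{N^2 \epsilon}\big)$ for any $S \subseteq \Omega_2$, it can be immediately 
seen that any extreme point of $\mathcal{Q}_I$ is a vector in $\mathbb{R}^{N^2}$ that has 
$\lfloor N^2 \epsilon \rfloor$ components with a value of $\frac{1}{N^2 \epsilon}$, one component with 
a (lower) value of $1 - \frac{\lfloor N^2 \epsilon \rfloor}{N^2 \epsilon}$, and zeros for all the 
remaining components. Letting $\Upsilon$ denote the set of all distinct permutations of such vectors 
in $\mathbb{R}^{N^2}$, it can be readily seen that $\Upsilon = \mathrm{ext}(\mathcal{Q}_I)$. Therefore, 
the expression (25) can be written (by switching the order of the max operators) as 

$$\alpha^* = \max\Big\{ N \epsilon_1 \cdot \max_{q \in \Upsilon} \Big( \max_{i \in \Omega_1} \{ V_i(q) \} \Big), \; \max_{q \in \Upsilon} \sum_{i \in \Omega_1} V_i(q) \Big\}. \tag{27}$$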
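
For small $N$, the expression (27) can be evaluated by listing the extreme points explicitly. The following Python sketch does so under our own illustrative conventions: leaves are stored in a flat tuple in which the children of first-stage node $i$ occupy positions $iN, \dots, (i+1)N - 1$, the level is assumed to satisfy $\epsilon \ge 1/N^2$, and all function names are hypothetical.

```python
import math
from itertools import permutations

def extreme_points(N, eps):
    """Distinct permutations of the vector with floor(N^2 eps) entries equal to
    1 / (N^2 eps), one fractional entry, and zeros (assumes eps >= 1 / N^2)."""
    k = N * N * eps
    full = math.floor(k + 1e-12)  # small guard against rounding in k
    values = [1.0 / k] * full
    if full < N * N:
        values.append(1.0 - full / k)  # fractional component (possibly zero)
    values += [0.0] * (N * N - len(values))
    return set(permutations(values))

def V(block, N, eps2):
    """V_i(q) for the sub-vector `block` of q on the children of node i."""
    return max(sum(block), N * eps2 * max(block))

def alpha_star(N, eps, eps1, eps2):
    """Brute-force evaluation of (27)."""
    best = 0.0
    for q in extreme_points(N, eps):
        vs = [V(q[i * N:(i + 1) * N], N, eps2) for i in range(N)]
        best = max(best, N * eps1 * max(vs), sum(vs))
    return best

print(alpha_star(2, 0.5, 0.75, 0.75))
```

With $N = 2$, $\epsilon = 0.5$ and $\epsilon_1 = \epsilon_2 = 0.75$, the sketch returns 1.5: each extreme point has two entries equal to $1/2$, and both the concentrated and the split placements lead to the same maximum.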

We now prove a set of lemmas concerning the different terms appearing in (27), which will lead to the 
analytical solution for $\alpha^*$. 

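Lemma 4.2. The optimal value of $\max_{q \in \Upsilon} \big( \max_{i \in \Omega_1} \{ V_i(q) \} \big)$ 
is given by 

$$\max_{q \in \Upsilon} \Big( \max_{i \in \Omega_1} \{ V_i(q) \} \Big) = \begin{cases} \frac{1}{N \epsilon} & \text{if } \epsilon > \frac{1}{N}, \\ \max\big(1, \frac{\epsilon_2}{N \epsilon}\big) & \text{otherwise.} \end{cases}$$

Proof. Due to the symmetry of the problem, it suffices to consider $\max_{q \in \Upsilon} V_i(q)$ for a 
given (fixed) $i \in \Omega_1$. Since 
$V_i(q) = \max\{ q(\mathcal{C}_i), \; N \epsilon_2 \cdot \max(q|_{\mathcal{C}_i}) \}$, the maximum is 
trivially reached when $q|_{\mathcal{C}_i}$ has each component as large as possible. If 
$\epsilon > \frac{1}{N}$, then $\lfloor N^2 \epsilon \rfloor \ge N$, so that $q|_{\mathcal{C}_i}$ has 
the highest possible value of $\frac{1}{N^2 \epsilon}$ for each of its components, resulting in 
$V_i(q) = \frac{1}{N \epsilon}$. If $\epsilon \le \frac{1}{N}$, the optimal $q|_{\mathcal{C}_i}$ 
contains all the $\lceil N^2 \epsilon \rceil \le N$ non-zero components of $q$, resulting in 
$\max_{q \in \Upsilon} V_i(q) = \max\big(1, \frac{\epsilon_2}{N \epsilon}\big)$. □ 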

We next consider $\max_{q \in \mathcal{Y}} \sum_{i \in \Omega_1} V_i(q)$, and derive the following supporting result.

Lemma 4.3. The optimal $q$ that maximizes $\sum_{i \in \Omega_1} V_i(q)$ is such that it maximizes$^\dagger$ $\sum_{i \in \Omega_1} \mathbb{1}_{\{q(\mathcal{C}_i) \neq 0\}}$. Further, if $\varepsilon > \frac{1}{N}$, the optimal $q$ has a component $\frac{1}{N^2\varepsilon}$ in each separate $q|_{\mathcal{C}_i}$, $i \in \Omega_1$.

$^\dagger$For a boolean proposition $A$, we use $\mathbb{1}_A$ to denote the indicator function, $\mathbb{1}_A = 1$ if $A$ is true, and $0$ otherwise.

Proof. We prove by contradiction, using an interchange argument. In particular, assume that the optimal $q$ does not maximize $\sum_{i \in \Omega_1} \mathbb{1}_{\{q(\mathcal{C}_i) \neq 0\}}$, so that $\exists j \in \Omega_1$ such that $q|_{\mathcal{C}_j}$ has at least two nonzero components. Without loss of generality, let $\ell \in \mathcal{C}_j$ be such that $q_\ell > 0$, and let $k \in \Omega_1$ be such that $q|_{\mathcal{C}_k} = 0$. Consider an arbitrary $i \in \mathcal{C}_k$, and a new vector $\bar{q} = q - q_\ell \mathbb{1}_\ell + q_\ell \mathbb{1}_i$ (i.e., the nonzero component $q_\ell$ is transferred from index $\ell$ to index $i$). It can be immediately checked that $\bar{q} \in \mathcal{Q}_I$, and $V_i(\bar{q}) = V_i(q)$ for any $i \neq j, k$. Furthermore, since $q|_{\mathcal{C}_j}$ contained at least two nonzero elements, $V_j(q) - V_j(\bar{q}) \le q_\ell$, while $V_k(q) - V_k(\bar{q}) = -q_\ell \cdot \max(1, N\varepsilon_2) \le -q_\ell$. Therefore, $\sum_{i \in \Omega_1} V_i(q) \le \sum_{i \in \Omega_1} V_i(\bar{q})$, which is a contradiction.

If $\varepsilon \le \frac{1}{N}$, there are at most $\lfloor N^2\varepsilon \rfloor \le N$ nonzero components in $q$, and the above argument implies that it is optimal to have at most one such component in each $\mathcal{C}_i$, $i \in \Omega_1$, resulting in $\sum_{i \in \Omega_1} V_i(q) = \max(1, N\varepsilon_2)$.

If $\varepsilon > \frac{1}{N}$, the above argument implies that it is optimal to have at least one non-zero component in each $q|_{\mathcal{C}_i}$, $i \in \Omega_1$. In this case, it can be shown that the optimal $q$ has a component $\frac{1}{N^2\varepsilon}$ in each $q|_{\mathcal{C}_i}$, $i \in \Omega_1$. We omit the details here, but note that the proof is again by contradiction and involves a similar interchange argument as above (between the fractional $1 - \frac{\lfloor N^2\varepsilon \rfloor}{N^2\varepsilon}$ component in a particular $q|_{\mathcal{C}_i}$ and a component $\frac{1}{N^2\varepsilon}$ in a $q|_{\mathcal{C}_j}$, $j \neq i$, that necessarily contains at least two components). □

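Due to Lemma 4.3, in the case that $\varepsilon > \frac{1}{N}$, we only need to specify the distribution of the remaining $\lfloor N^2\varepsilon \rfloor - N$ nonzero components to fully characterize $\max_{q \in \mathcal{Y}} \sum_{i \in \Omega_1} V_i(q)$. The following result achieves this.

Lemma 4.4. Let $n^* = \left\lfloor \frac{\lfloor N^2\varepsilon \rfloor - N}{N-1} \right\rfloor$. If $\varepsilon > \frac{1}{N}$, there exists an optimal $q$ that maximizes $\sum_{i \in \Omega_1} V_i(q)$, such that:

(i) $q|_{\mathcal{C}_i}$ has a component $\frac{1}{N^2\varepsilon}$ for all $i \in \Omega_1$.

(ii) If $n^* \ge 1$, $q|_{\mathcal{C}_i} = \frac{1}{N^2\varepsilon} \mathbb{1}$, $\forall i \in \{1, \dots, n^*\}$.

(iii) $q|_{\mathcal{C}_{n^*+1}}$ contains $N_1 = 1 + (\lfloor N^2\varepsilon \rfloor - N) \bmod (N-1)$ components equal to $\frac{1}{N^2\varepsilon}$ and one component equal to $1 - \frac{\lfloor N^2\varepsilon \rfloor}{N^2\varepsilon}$.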

Proof. The proof is by contradiction, which is derived using an exchange argument.

(i). This is a direct consequence of Lemma 4.3.

For proving (ii) and (iii), we first note that $\sum_{i \in \Omega_1} V_i(q)$ is invariant with respect to permutations of elements in $\Omega_1$, as is $V_i(q)$ with respect to permutations of elements in $\mathcal{C}_i$. Therefore, without loss of generality, we may restrict attention to those $q \in \mathcal{Y}$ such that $q(\mathcal{C}_i)$, and hence implicitly $V_i(q)$, form a non-increasing sequence across $i \in \Omega_1$, i.e.,

$$q(\mathcal{C}_1) \ge q(\mathcal{C}_2) \ge \dots \ge q(\mathcal{C}_N). \tag{28}$$

(ii). We argue that, in case (ii) is not true in an optimal $q$, a solution with objective at least as large as that of $q$ can be constructed, satisfying (ii). To this end, consider a restricted optimal $q$ satisfying (i) and (28), and assume that (ii) is true only for $i \in \{1, \dots, m\}$, where $m < n^*$. Such a case is possible only if $1 \le n^* < N$, which implies that $(N - m) \ge 2$. Also, since (ii) is satisfied only for $m < n^*$ subtrees $\mathcal{C}_i$, due to the pigeon-hole principle, $q|_{\mathcal{C}_{m+1}}$ and $q|_{\mathcal{C}_{m+2}}$ contain at least two components. Consider a $\bar{q}$ obtained by exchanging a component with value $< \frac{1}{N^2\varepsilon}$ in $q|_{\mathcal{C}_{m+1}}$ with a component $\frac{1}{N^2\varepsilon}$ in $q|_{\mathcal{C}_{m+2}}$. Using the same argument as in the proof of Lemma 4.3, it can be seen that $\bar{q}$ may only increase $\sum_{i \in \Omega_1} V_i$ compared to $q$. Relabeling the indices in $\{m+2, \dots, N\}$ as needed to ensure that $\bar{q}$ satisfies (28), and repeating the above exchange argument as many times as needed eventually results in a $q$ that satisfies (ii).

(iii). The proof involves the same interchange argument as above, between $q|_{\mathcal{C}_{n^*+1}}$ and $q|_{\mathcal{C}_{n^*+2}}$. We omit the details due to space considerations. □

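These results lead to the following main theorem concerning an analytical value for $\alpha^*$.

Theorem 4.3. If $\varepsilon > \frac{1}{N}$, the optimal scaling $\alpha^*$ is given by

$$\alpha^* = \max\left\{ \frac{\varepsilon_1}{\varepsilon},\; f(N, \varepsilon, \varepsilon_2) \right\}, \tag{29}$$

where

$$f(N, \varepsilon, \varepsilon_2) = \frac{1}{N\varepsilon} \left\lfloor \frac{\lfloor N^2\varepsilon \rfloor - N}{N-1} \right\rfloor + \max\left\{ \frac{\varepsilon_2}{N\varepsilon},\; \frac{N_1}{N^2\varepsilon} + 1 - \frac{\lfloor N^2\varepsilon \rfloor}{N^2\varepsilon} \right\} + \left( N - \left\lfloor \frac{\lfloor N^2\varepsilon \rfloor - N}{N-1} \right\rfloor - 1 \right) \frac{\varepsilon_2}{N\varepsilon},$$

and $N_1 = 1 + (\lfloor N^2\varepsilon \rfloor - N) \bmod (N-1)$. In the above case, the optimal scaling $\alpha^*$ is upper bounded by

$$\alpha_{UB}(\varepsilon_1, \varepsilon_2) = \max\left\{ \frac{\varepsilon_1}{\varepsilon},\; \frac{\varepsilon_2}{\varepsilon} + \frac{(1-\varepsilon_2)(N\varepsilon - 1)}{(N-1)\varepsilon} \right\}, \tag{30}$$

and the bound becomes tight if $\lfloor N^2\varepsilon \rfloor = N^2\varepsilon$ and $\left\lfloor \frac{\lfloor N^2\varepsilon \rfloor - N}{N-1} \right\rfloor = \frac{\lfloor N^2\varepsilon \rfloor - N}{N-1}$. Lastly, if $\varepsilon \le \frac{1}{N}$, the optimal scaling $\alpha^*$ is given as

$$\alpha^* = \max\left\{ N\varepsilon_1,\; \frac{\varepsilon_1\varepsilon_2}{\varepsilon},\; N\varepsilon_2 \right\}. \tag{31}$$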

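Proof. Consider the case of $\varepsilon > \frac{1}{N}$. Equation (27) captures $\alpha^*$ as the maximum of two arguments. Lemma 4.2 specifies the maximum possible value for the first argument, which can be seen to be $\frac{\varepsilon_1}{\varepsilon}$. Lemma 4.4 specifies an optimal $q^* \in \mathcal{Y}$ that maximizes the second argument $\sum_{i \in \Omega_1} V_i(q)$. We may therefore write $\alpha^*$ in this case as

$$\alpha^* = \max\left\{ \frac{\varepsilon_1}{\varepsilon},\; \sum_{i \in \Omega_1} \max\{ q^*(\mathcal{C}_i),\; N\varepsilon_2 \cdot \max(q^*|_{\mathcal{C}_i}) \} \right\}. \tag{32}$$

Applying Lemma 4.4, and writing $n^* = \left\lfloor \frac{\lfloor N^2\varepsilon \rfloor - N}{N-1} \right\rfloor$, it can be seen that the value of the second argument in (32) is

$$\frac{n^*}{N\varepsilon} + \max\left\{ \frac{\varepsilon_2}{N\varepsilon},\; \frac{N_1}{N^2\varepsilon} + 1 - \frac{\lfloor N^2\varepsilon \rfloor}{N^2\varepsilon} \right\} + (N - n^* - 1) \frac{\varepsilon_2}{N\varepsilon}.$$

When the maximum in the middle expression is the first term (i.e., $\frac{\varepsilon_2}{N\varepsilon}$), we get the value

$$\frac{\varepsilon_2}{\varepsilon} + \left\lfloor \frac{\lfloor N^2\varepsilon \rfloor - N}{N-1} \right\rfloor \frac{1-\varepsilon_2}{N\varepsilon} \le \frac{\varepsilon_2}{\varepsilon} + \frac{(N\varepsilon - 1)(1-\varepsilon_2)}{(N-1)\varepsilon},$$

which exactly corresponds to the bound in (30). When the maximum in the middle expression is the second term, the expression becomes

$$\frac{\varepsilon_2}{\varepsilon} + n^* \frac{1-\varepsilon_2}{N\varepsilon} + \frac{N_1 + N^2\varepsilon - \lfloor N^2\varepsilon \rfloor}{N^2\varepsilon} - \frac{\varepsilon_2}{N\varepsilon}.$$

Using $f_1 := N^2\varepsilon - \lfloor N^2\varepsilon \rfloor$ and $f_2 := \frac{\lfloor N^2\varepsilon \rfloor - N}{N-1} - \left\lfloor \frac{\lfloor N^2\varepsilon \rfloor - N}{N-1} \right\rfloor$, so that $N_1 = 1 + (N-1) f_2$, the above expression becomes

$$\frac{\varepsilon_2}{\varepsilon} + \frac{(N\varepsilon - 1)(1-\varepsilon_2)}{(N-1)\varepsilon} + \left( 1 - f_2 - \frac{f_1}{N-1} \right) \cdot \frac{1 - N\varepsilon_2}{N^2\varepsilon}.$$

We claim that $1 - f_2 - \frac{f_1}{N-1} \ge 0$. To see this, note that, since $(N-1) f_2 < N-1$ and $(N-1) f_2 \in \mathbb{Z}$, we have $(N-1) f_2 \le N-2$, so that

$$1 - f_2 - \frac{f_1}{N-1} \ge \frac{1}{N-1} - \frac{f_1}{N-1} \ge 0 \quad (\text{since } f_1 \in [0, 1)).$$

Therefore, since $1 \le N\varepsilon_2$, we have that $\left( 1 - f_2 - \frac{f_1}{N-1} \right) \cdot \frac{1 - N\varepsilon_2}{N^2\varepsilon} \le 0$, hence the expression for $\alpha^*$ is again upper bounded by (30), which completes the proof for the first claim of the theorem.

If $\varepsilon \le \frac{1}{N}$, (27) simplifies due to Lemma 4.2 and Lemma 4.3 as

$$\alpha^* = \max\left\{ N\varepsilon_1 \cdot \max\left(1, \frac{\varepsilon_2}{N\varepsilon}\right),\; \max(1, N\varepsilon_2) \right\},$$

which reduces to (31), since $\frac{\varepsilon_1\varepsilon_2}{\varepsilon} \ge 1$ under the requirement $\varepsilon_1\varepsilon_2 \ge \varepsilon$. This concludes the proof of the theorem. □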
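As a sanity check on the two-stage analysis, the following sketch compares a brute-force evaluation of (27) over all extreme points with a closed form assembled from Lemma 4.4 for the case $\varepsilon > 1/N$. The closed form used here is a reconstruction from the surrounding lemmas rather than a verbatim transcription, and the function names are hypothetical. Exact rational arithmetic is used so that the two values can be compared with equality.

```python
from fractions import Fraction
from itertools import combinations
from math import floor

def extreme_points(N, eps):
    """All vectors with floor(N^2 eps) entries 1/(N^2 eps), one fractional entry, zeros elsewhere."""
    n = N * N
    k = floor(n * eps)
    full = 1 / (n * eps)
    frac = 1 - k * full
    for pos in combinations(range(n), k):
        base = [Fraction(0)] * n
        for p in pos:
            base[p] = full
        if frac == 0:
            yield base
            continue
        for r in range(n):
            if base[r] == 0:
                q = list(base)
                q[r] = frac
                yield q

def alpha_brute(N, eps, e1, e2):
    """max over extreme points of max{N e1 max_i V_i, sum_i V_i}, as in (27)."""
    best = Fraction(0)
    for q in extreme_points(N, eps):
        blocks = [q[i * N:(i + 1) * N] for i in range(N)]
        V = [max(sum(b), N * e2 * max(b)) for b in blocks]
        best = max(best, N * e1 * max(V), sum(V))
    return best

def alpha_closed(N, eps, e1, e2):
    """Closed form for eps > 1/N (assumed reconstruction based on Lemma 4.4)."""
    k = floor(N * N * eps)
    n_star = (k - N) // (N - 1)
    N1 = 1 + (k - N) % (N - 1)
    unit = 1 / (N * eps)
    middle = max(e2 * unit, Fraction(N1) / (N * N * eps) + 1 - Fraction(k) / (N * N * eps))
    return max(e1 / eps, n_star * unit + middle + (N - n_star - 1) * e2 * unit)
```

The enumeration grows combinatorially in $N^2$, so the check is only practical for very small trees (for instance $N = 2$ or $N = 3$).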



4.2 Optimal Design of Dynamically Consistent AVaR 

The previous results provide exact characterizations of the price of dynamic consistency, $\alpha^*$, for any fixed choice of AVaR risk measures (as specified by $\varepsilon$ and $\varepsilon_{1,2}$). From a practical perspective, a natural question is how to obtain the tightest possible approximation of the naive AVaR by means of a dynamically consistent AVaR. More precisely, for a fixed $\varepsilon$ and $N$, we may seek the $\varepsilon_{1,2}$ resulting in the smallest possible price of dynamic consistency $\alpha^*$. The following set of results provides a characterization of these choices, and a discussion of the insights.

Lemma 4.5. When $\varepsilon \le \frac{1}{N}$, the smallest $\alpha^*$ is obtained by taking $\varepsilon_1 = \varepsilon_2 = \sqrt{\varepsilon}$.

Proof. Theorem 4.3 states that $\alpha^* = \max\left(N\varepsilon_1, \frac{\varepsilon_1\varepsilon_2}{\varepsilon}, N\varepsilon_2\right)$ whenever $\varepsilon \le \frac{1}{N}$. Since we always require $\varepsilon_1\varepsilon_2 \ge \varepsilon$ (so as to have $\mu_C \le \mu_I$), it is optimal to set $\varepsilon_1\varepsilon_2 = \varepsilon$. By the symmetry of the expression for $\alpha^*$, this implies that $\varepsilon_1^* = \varepsilon_2^* = \sqrt{\varepsilon}$ always results in the smallest possible price of dynamic consistency, namely $\alpha^* = N\sqrt{\varepsilon}$ for $\frac{1}{N^2} < \varepsilon \le \frac{1}{N}$ and $\alpha^* = 1$ for $\varepsilon \le \frac{1}{N^2}$. □

For the remainder of the analysis, we work with the upper bound $\alpha_{UB}$ provided by (30). The following lemma summarizes the necessary results for the case of $\varepsilon > \frac{1}{N}$.

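Lemma 4.6. The smallest value of $\alpha_{UB}$ is given by

$$\alpha_{UB}^* = \frac{2N(1-\varepsilon)}{1 - N\varepsilon + \sqrt{1 + N^2\varepsilon(4 - 3\varepsilon) + 2N\varepsilon(-3 + 2\varepsilon)}}, \tag{33}$$

and corresponds to setting

$$\varepsilon_1^* = \frac{2N\varepsilon(1-\varepsilon)}{1 - N\varepsilon + \sqrt{1 + N^2\varepsilon(4 - 3\varepsilon) + 2N\varepsilon(-3 + 2\varepsilon)}}, \qquad \varepsilon_2^* = \frac{1 - N\varepsilon + \sqrt{1 + N^2\varepsilon(4 - 3\varepsilon) + 2N\varepsilon(-3 + 2\varepsilon)}}{2N(1-\varepsilon)}.$$

Proof. The optimal $\alpha_{UB}$ is obtained from the following optimization problem:

$$\begin{aligned} \min_{\varepsilon_1, \varepsilon_2} \;\; & \max\left\{ \frac{\varepsilon_1}{\varepsilon},\; \frac{\varepsilon_2}{\varepsilon} + \frac{(1-\varepsilon_2)(N\varepsilon - 1)}{(N-1)\varepsilon} \right\} \\ \text{s.t.} \;\; & \varepsilon_1\varepsilon_2 \ge \varepsilon, \\ & \varepsilon_{1,2} \in [0, 1]. \end{aligned}$$

It is easy to see that the objective is a convex function of $\varepsilon_1$ and $\varepsilon_2$. Furthermore, since $\varepsilon \le 1 \Rightarrow N\varepsilon - 1 \le N - 1$, it is also non-decreasing in $\varepsilon_1$ and $\varepsilon_2$. Therefore, the constraint $\varepsilon_1\varepsilon_2 \ge \varepsilon$ will hold with equality at optimality (otherwise, we could decrease both $\varepsilon_1$ and $\varepsilon_2$, yielding a strictly smaller objective), yielding

$$\alpha_{UB}^* = \min_{\varepsilon_2 \in [0,1]} \max\left\{ \frac{1}{\varepsilon_2},\; \frac{\varepsilon_2}{\varepsilon} + \frac{(1-\varepsilon_2)(N\varepsilon - 1)}{(N-1)\varepsilon} \right\}.$$

Since the first term is non-increasing in $\varepsilon_2$, while the second is non-decreasing, the value $\varepsilon_2$ yielding the unconstrained optimum is given by

$$\frac{1}{\varepsilon_2} = \frac{\varepsilon_2}{\varepsilon} + \frac{(1-\varepsilon_2)(N\varepsilon - 1)}{(N-1)\varepsilon} \;\Longleftrightarrow\; \varepsilon_2 = \frac{1 - N\varepsilon \pm \sqrt{1 + N^2\varepsilon(4 - 3\varepsilon) + 2N\varepsilon(-3 + 2\varepsilon)}}{2N(1-\varepsilon)}.$$

It can be easily checked that the solution with a $+$ sign is feasible for the $[0, 1]$ bound constraints, and hence provides the overall optimal choice. The corresponding $\varepsilon_1^*$ and $\alpha_{UB}^*$ can then be immediately obtained. □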
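The closed-form split can be checked numerically. The sketch below evaluates the formulas of Lemma 4.6 and compares them with a coarse grid search over $\varepsilon_2$ on the curve $\varepsilon_1\varepsilon_2 = \varepsilon$; it assumes $1/N < \varepsilon < 1$, and the function names are hypothetical.

```python
import math

def alpha_ub(N, eps, e1, e2):
    """Upper bound (30) on the price of dynamic consistency."""
    return max(e1 / eps, e2 / eps + (1 - e2) * (N * eps - 1) / ((N - 1) * eps))

def optimal_split(N, eps):
    """Closed-form (eps1*, eps2*, alpha_UB*) from Lemma 4.6; assumes 1/N < eps < 1."""
    disc = 1 + N**2 * eps * (4 - 3 * eps) + 2 * N * eps * (-3 + 2 * eps)
    e2 = (1 - N * eps + math.sqrt(disc)) / (2 * N * (1 - eps))
    e1 = eps / e2          # the constraint eps1 * eps2 >= eps is tight at the optimum
    return e1, e2, 1 / e2  # at the optimum both terms of the max equal 1 / eps2

N, eps = 2, 0.75
e1, e2, a = optimal_split(N, eps)

# Grid search over eps2 in [eps, 1] with eps1 = eps / eps2 (so that eps1 <= 1).
grid = [alpha_ub(N, eps, eps / x, x)
        for x in (eps + (1 - eps) * k / 10000 for k in range(10001))]
```

For this instance the grid minimum agrees with the closed-form value up to the grid resolution, which is consistent with the two terms of the maximum crossing at $\varepsilon_2^*$.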

The optimal choices are depicted in Figure 4. As can be seen from the plots, the optimal $\varepsilon_{1,2}^*$ (for $\varepsilon > \frac{1}{N}$) have a weak dependency on $N$, quickly converging to the values

$$\bar{\varepsilon}_1 = \lim_{N \to \infty} \varepsilon_1^*(N) = \frac{2(1-\varepsilon)\sqrt{\varepsilon}}{\sqrt{4 - 3\varepsilon} - \sqrt{\varepsilon}}, \qquad \bar{\varepsilon}_2 = \lim_{N \to \infty} \varepsilon_2^*(N) = \frac{\sqrt{(4 - 3\varepsilon)\varepsilon} - \varepsilon}{2(1-\varepsilon)},$$

which yield the corresponding price of consistency $\bar{\alpha} = \frac{2(1-\varepsilon)}{\sqrt{(4 - 3\varepsilon)\varepsilon} - \varepsilon}$. It is interesting to note that the choice of parameters $\varepsilon_{1,2}^*$ yielding the smallest price of dynamic consistency always satisfies $\varepsilon_1^* \ge \varepsilon_2^*$, i.e., more conservative risk assessments in the second period. However, the exact split depends on the level of risk-aversion in the naive (static) assessment: if the latter is quite conservative ($\varepsilon \le \frac{1}{N}$), then it is optimal to take $\varepsilon_1^* = \varepsilon_2^*$, i.e., to measure the risk using the same function in each period.



5 Bounds For Multi-stage General Risk Measures 

In this section, we extend the results from previous sections to general distortion risk measures and more than two stages. While the ratio $\alpha^*$ can be expressed analytically for problems with 2 stages and AVaR risk measures, computing it becomes significantly harder for arbitrary distortion risk measures and more stages. We show, in particular, that it is in general NP-hard to solve Problem 1 even for a fixed number of stages. Solving Problem 1 is, however, tractable when the consistent risk measure is AVaR with uniform reference probability measures.

Figure 4: Choices resulting in tightest bound: $\varepsilon_1^*$ (a), $\varepsilon_2^*$ (b), and resulting $\alpha_{UB}^*$ (c).

5.1 General Distortion Risk Measures 

This section extends the results from Section 3 to general risk measures applied to multi-stage dynamic settings. Recall from Definition 2.4 that a dynamically consistent distortion risk measure for $T$ stages is defined by

$$\mu_C(Y) := (\mu_1 \circ \mu_2 \circ \dots \circ \mu_T)(Y).$$

The following theorem is then a $T$-stage equivalent of the optimization in Theorem 3.3.

Theorem 5.1. The smallest value of $\alpha$ such that $\mu_I(Y) \le \alpha\, \mu_C(Y)$, $\forall Y \ge 0$, for $T$ stages is given by

$$\alpha^* = \max_{q \in \mathrm{ext}(\mathcal{Q}_I)} \max_{S \subseteq \Omega_1} \frac{\sum_{i \in S} z_1(i, q)}{c_1(S)}, \qquad z_t(i, q) = \max_{U \subseteq \mathcal{C}_i} \frac{\sum_{j \in U} z_{t+1}(j, q)}{c_{t+1|i}(U)}, \quad t \in [1, T-1], \tag{35}$$

and $z_T(i, q) := q_i$ for all $i \in \Omega_T$.

Theorem 5.1 follows by induction, using an argument identical to that of Theorem 3.3.

Solving Problem 1 entails solving (35); we show that this problem is NP-hard to solve for general risk measures, even when $T = 1$. The hardness result follows by a reduction from SUBSET-SUM, which is defined as follows.

Definition 5.1 (SUBSET-SUM). Given a set of integers $\{k_1, k_2, \dots, k_m\}$, is there a subset that sums to $s$?

The SUBSET-SUM problem is NP-hard [14]. To reduce SUBSET-SUM to computing $\alpha^*$, we use the Choquet representation of risk measures described in Theorem 2.2. To that end, assume any submodular set function $c$ and a probability measure $\mathbb{P}$ such that:

$$c(\emptyset) = 0, \qquad c(\Omega_T) = 1, \qquad c(S) = g(\mathbb{P}(S)) \quad \forall S \subseteq \Omega_T,$$

for some concave function $g : \mathbb{R} \to \mathbb{R}$. Then, as discussed in Section 2, the function $\mu : \mathcal{X} \to \mathbb{R}$ defined by

$$\mu(Y) = \max\left\{ q^T Y : q(S) \le c(S),\; \forall S \subseteq \Omega_T \right\}$$

is a distortion risk measure. We are now ready to prove the hardness of solving Problem 1.

Theorem 5.2. Suppose $\alpha^*$ is the optimal solution of (35) for general distortion risk measures and any fixed number of stages $T \ge 1$. Then, it is NP-hard to decide if $\alpha^* = a$ for any $a \ge 1$.

Proof. Solving (35) for any $T > 1$ is at least as hard as for $T = 1$. This can be seen by setting $|\Omega_t| = 1$ for all $t \in [1, T-1]$. The optimization problem (35) for $T = 1$ reduces to

$$\alpha^* = \sup_{Y \ge 0} \frac{\max_{q \in \mathrm{sub}(\mathcal{Q}_I)} q^T Y}{\max_{q \in \mathrm{sub}(\mathcal{Q}_C)} q^T Y} = \sup_{Y \ge 0} \frac{\max_{q \in \mathrm{sub}(\mathcal{Q}_1)} q^T Y}{\max_{q \in \mathrm{sub}(\mathcal{Q}_2)} q^T Y}, \tag{36}$$

where both $\mathcal{Q}_1$, $\mathcal{Q}_2$ correspond to one-step risk measures. Using the representation of $\mathcal{Q}_1$, $\mathcal{Q}_2$ in Proposition 3.1 we have:

$$\mathcal{Q}_i = \left\{ q \in \Delta^{|\Omega_1|} : q(S) \le c_i(S),\; \forall S \subseteq \Omega_1 \right\} \quad \text{for } i = 1, 2.$$

The functions $c_i$ are, in particular, of the form $c_i(S) = g_i(\mathbb{P}(S))$, where the $g_i$ are concave functions and $\mathbb{P}$ is a measure. Because both $\mathrm{sub}(\mathcal{Q}_1)$ and $\mathrm{sub}(\mathcal{Q}_2)$ are polymatroids and downward monotone, Theorem 7.2 can be used to further simplify the optimization of $\alpha^*$ to:

$$\alpha^* = \sup_{Y \ge 0} \frac{\max_{q \in \mathrm{sub}(\mathcal{Q}_1)} q^T Y}{\max_{q \in \mathrm{sub}(\mathcal{Q}_2)} q^T Y} = \max_{S \subseteq \Omega_1} \frac{c_1(S)}{c_2(S)}. \tag{37}$$

Now, assume a SUBSET-SUM problem with values ki,k2 ■ ■ ■ km and a value s such that ^ s ^ 1, then 
construct the functions ci and C2 as follows: 

m 

$$c_1(S) = \min\left\{ \mathbb{P}(S)/s,\; 1 \right\}, \qquad c_2(S) = \sqrt{\mathbb{P}(S)}.$$

The optimal value of this problem is $1/\sqrt{s}$ if and only if there is a subset that sums to $s$. Theorem 7.2 requires that $\mathrm{sub}(\mathcal{Q}_2) \subseteq \mathrm{sub}(\mathcal{Q}_1)$, which holds trivially since $c_2(S) \le c_1(S)$ for all $S \subseteq \Omega_1$. Since both $c_1$, $c_2$ satisfy the conditions of distortion risk measures, any SUBSET-SUM problem can be reduced to the problem of computing the optimal scale of two distortion risk measures. □
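The role of the two capacities in the reduction can be illustrated on a toy instance. The sketch below assumes, purely for illustration, that the measure is $\mathbb{P}(S) = \sum_{i \in S} k_i / \sum_i k_i$ and that the target is normalized in the same way; it then evaluates $\max_S c_1(S)/c_2(S)$ by brute force. The function name is hypothetical, and the enumeration is exponential, in line with the hardness result.

```python
from itertools import combinations
import math

def max_capacity_ratio(values, target):
    """Brute-force max over nonempty subsets of c1(S)/c2(S), with
    c1(S) = min(P(S)/s, 1), c2(S) = sqrt(P(S)), and P, s normalized by sum(values)."""
    total = sum(values)
    s = target / total
    best = 0.0
    for r in range(1, len(values) + 1):
        for subset in combinations(values, r):
            p = sum(subset) / total
            best = max(best, min(p / s, 1.0) / math.sqrt(p))
    return best, 1.0 / math.sqrt(s)

# {3, 5, 8} has a subset summing to 8, but none summing to 7.
hit, hit_bound = max_capacity_ratio([3, 5, 8], 8)
miss, miss_bound = max_capacity_ratio([3, 5, 8], 7)
```

The ratio $\min(p/s, 1)/\sqrt{p}$ increases in $p$ up to $p = s$ and decreases afterwards, so the value $1/\sqrt{s}$ is attained exactly when some subset has normalized weight $s$.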

Solving (35) may be difficult for two main reasons. First, the set $\mathrm{ext}(\mathcal{Q}_I)$ is combinatorial, with exponentially many vertices. Second, the inner maximizations over $U \subseteq \mathcal{C}_i$ range over exponentially many subsets. The following lemma indicates that the NP-hardness is most likely due to the former reason, not the latter one.

Lemma 5.1. Consider the specialization of (35) for some fixed $q \in \mathcal{Q}_I$:

$$\max_{S \subseteq \Omega_1} \frac{\sum_{i \in S} z_1(i, q)}{c_1(S)}, \qquad z_t(i, q) = \max_{U \subseteq \mathcal{C}_i} \frac{\sum_{j \in U} z_{t+1}(j, q)}{c_{t+1|i}(U)}, \quad t \in [1, T-1]. \tag{38}$$

This problem can be solved using the Ellipsoid algorithm in time polynomial in $N$ for any fixed $T$.

Proof. The optimization problem (38) for two stages can be simplified to:

$$\max_{S \subseteq \Omega_1} \frac{\sum_{i \in S} Z_i}{c_1(S)}, \qquad Z_i = \max_{U \subseteq \mathcal{C}_i} \frac{q(U)}{c_{2|i}(U)}.$$

Each value $Z_i$ can be computed as:

$$Z_i = \max_{U \subseteq \mathcal{C}_i} \frac{q(U)}{c_{2|i}(U)} = \min\left\{ l \in \mathbb{R} : l \cdot c_{2|i}(U) - q(U) \ge 0,\; \forall U \subseteq \mathcal{C}_i \right\}.$$

This optimization can be solved in polynomial time by bisection on $l$, starting with $l = 1$. For any $l$, the constraint $l \cdot c_{2|i}(U) - q(U) \ge 0$, $\forall U \subseteq \mathcal{C}_i$, can be checked in polynomial time, since the function $l \cdot c_{2|i}(U) - q(U)$ is submodular and can be minimized by the Ellipsoid method [51]. The problem $\max_{S \subseteq \Omega_1} \sum_{i \in S} Z_i / c_1(S)$ can then be solved similarly, using the linearity of $\sum_{i \in S} Z_i$ with respect to $S$. This method can be readily extended to any number of steps $T$ by induction. □
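The bisection step can be sketched as follows. For brevity, the feasibility check below enumerates all subsets, standing in for the polynomial-time submodular minimization used in the proof; it also assumes a capacity that depends only on the cardinality of $U$ (as for AVaR with a uniform reference measure). All function names are hypothetical.

```python
from itertools import combinations

def nonempty_subsets(n):
    """All nonempty index subsets of {0, ..., n-1}."""
    for r in range(1, n + 1):
        yield from combinations(range(n), r)

def feasible(level, q, cap):
    """True if level * c(U) - q(U) >= 0 for every nonempty U (brute-force stand-in
    for submodular function minimization)."""
    return all(level * cap(len(U)) - sum(q[j] for j in U) >= 0
               for U in nonempty_subsets(len(q)))

def max_ratio_bisection(q, cap, tol=1e-9):
    """Smallest level l with l * c(U) >= q(U) for all U, i.e. max_U q(U) / c(U)."""
    lo, hi = 0.0, 1.0
    while not feasible(hi, q, cap):   # grow the bracket until it contains the answer
        lo, hi = hi, 2 * hi
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if feasible(mid, q, cap):
            hi = mid
        else:
            lo = mid
    return hi

# AVaR capacity on N = 3 children with eps2 = 0.5: c(U) = min(1, |U| / (N * eps2)).
cap = lambda k: min(1.0, k / 1.5)
z = max_ratio_bisection([0.05, 0.9, 0.05], cap)
```

For this capacity, the result can be compared against the closed form $\max\{q(\mathcal{C}_i), N\varepsilon_2 \max(q|_{\mathcal{C}_i})\}$ used in Section 4.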
5.2 Uniform AVaR Measures

In this section, we extend the results from Section 4 to $T$ stages. In particular, we consider a uniform reference measure, an arbitrary inconsistent distortion risk measure $\mu_I : \mathbb{R}^{|\Omega_T|} \to \mathbb{R}$ and a $T$-stage consistent risk measure of the following form:

$$\mu_C(Y) = (\mathrm{AVaR}_{\varepsilon_1} \circ \dots \circ \mathrm{AVaR}_{\varepsilon_T})(Y).$$

The following result can be readily derived by induction using the same argument as Theorem 4.1.

Theorem 5.3. Suppose that $\mu_I(Y) = \mathrm{AVaR}_\varepsilon(Y)$. If $\varepsilon \le \varepsilon_1 \cdot \varepsilon_2 \cdots \varepsilon_T$, then $\mu_C(Y) \le \mu_I(Y)$ for all $Y \in \mathcal{X}$. Furthermore, if $\varepsilon_t \ge \frac{1}{N}$, $\forall t = 1, \dots, T$, then the reverse implication also holds.
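The first implication can be probed numerically for two stages. The sketch below assumes costs (larger is worse), a uniform reference measure, and a tree stored as a list of equally sized blocks of leaf values; `avar`, `static_risk` and `composed_risk` are hypothetical helper names.

```python
import random

def avar(values, eps):
    """AVaR at level eps under a uniform measure: fill the largest outcomes
    with weight at most 1 / (n * eps) each until the total weight is 1."""
    n = len(values)
    cap = 1.0 / (n * eps)
    remaining, total = 1.0, 0.0
    for v in sorted(values, reverse=True):
        w = min(cap, max(remaining, 0.0))
        total += w * v
        remaining -= w
    return total

def static_risk(tree, eps):
    """Inconsistent assessment: a single AVaR applied to all leaves."""
    return avar([y for block in tree for y in block], eps)

def composed_risk(tree, e1, e2):
    """Consistent assessment: AVaR_{e1} applied to the per-node AVaR_{e2} values."""
    return avar([avar(block, e2) for block in tree], e1)

random.seed(0)
e1, e2 = 0.6, 0.7
gaps = []
for _ in range(200):
    tree = [[random.random() for _ in range(3)] for _ in range(3)]
    gaps.append(static_risk(tree, e1 * e2) - composed_risk(tree, e1, e2))
```

With $\varepsilon = \varepsilon_1\varepsilon_2$, every recorded gap should be nonnegative up to rounding, in line with the first claim of the theorem.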

Theorem 5.3 addresses only one part of Problem 1 for $T$-stage AVaRs with uniform measures. To solve Problem 1 completely, one also needs to compute $\alpha^*$; the following theorem shows how.

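Theorem 5.4. Define for each $t \in [0, T-1]$ and $i \in \Omega_t$:

$$h_t(i, q) = \max\{ f_t(i, q),\; g_t(i, q) \}, \qquad f_t(i, q) = \sum_{j \in \mathcal{C}_i} h_{t+1}(j, q), \qquad g_t(i, q) = N\varepsilon_{t+1} \cdot \max_{j \in \mathcal{C}_i} h_{t+1}(j, q),$$

and $h_T(i, q) = q_i$ for each $i \in \Omega_T$. Then, the optimal scaling $\alpha^*$ for the $T$-stage special case with $\mu_C$ as AVaRs is given by

$$\alpha^* = \max_{q \in \mathcal{Q}_I} h_0(i_0, q) = \max_{q \in \mathrm{ext}(\mathcal{Q}_I)} h_0(i_0, q),$$

where $i_0 \in \Omega_0$ is the root node.

The theorem follows from (35) by induction, using an argument identical to that of Theorem 4.2.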
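For a fixed $q$, the recursion for $h_t$ is straightforward to evaluate. The sketch below stores the scenario tree as nested lists whose leaves carry the components of $q$; this representation and the function name `h` are assumptions made only for the illustration.

```python
from fractions import Fraction

def h(node, eps, depth=0):
    """Evaluate h_t(i, q) on a nested-list tree; eps[t] plays the role of eps_{t+1}."""
    if not isinstance(node, list):
        return node                        # leaf: h_T(i, q) = q_i
    children = [h(child, eps, depth + 1) for child in node]
    f = sum(children)                      # f_t: sum over the children
    g = len(children) * eps[depth] * max(children)   # g_t: N * eps_{t+1} * largest child
    return max(f, g)

# Two stages, N = 2, q an extreme point for eps = 3/4, and eps_1 = eps_2 = 9/10.
third = Fraction(1, 3)
q_tree = [[third, third], [third, Fraction(0)]]
value = h(q_tree, [Fraction(9, 10), Fraction(9, 10)])
```

Maximizing this quantity over the extreme points of $\mathcal{Q}_I$ is the hard part; the evaluation for one $q$ takes time linear in the size of the tree.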

Computing $\alpha^*$ by maximizing $h_0$ from Theorem 5.4 is nontrivial, since $h_0(i_0, q)$ is piecewise linear in $q$ with an exponential number of linear pieces. The number of linear pieces is exponential because any function

$$h_0(1, q) = \sum_{i=1}^{N} \max\{ f_1(i, q),\; g_1(i, q) \} = \max_{d \in \{0,1\}^N} \sum_{i=1}^{N} \left( d_i f_1(i, q) + (1 - d_i)\, g_1(i, q) \right)$$

represents a maximum over exponentially many linear functions, even when $f_1$ and $g_1$ are linear in $q$. However, we show next that there exists a function $\bar{h}_0$ that is a maximum over polynomially many linear functions and has the same optimal value as $h_0$. The function $\bar{h}_0$ is constructed by breaking symmetries induced by the uniform reference measure.

To simplify the notation, we restrict our attention to a 3-stage setting and only discuss how to generalize the argument to $T$ stages. We use tuples of indices for ease of reference: that is, we write $(i, j, l)$ instead of simply $l$ when $i \in \Omega_1$, $j \in \mathcal{C}_i$, $l \in \mathcal{C}_j$; for example $q_{(i,j,l)} = q_l$ and $f_2((i, j), q) = f_2(j, q)$. We also assume that the elements of any $\mathcal{C}_i$ are arbitrarily indexed by $1 \dots N$ for all $i \in \Omega_t$ and $t \in [0, T]$; for example $(i, j, 1)$ is a valid index and $1 \in \Omega_0$.

Figure 5 depicts how the function $\bar{h}_0$ breaks the symmetries by imposing a particular order on the elements of $q$. This approach is possible because all the reference measures are uniform. The order we assume is such that $f_t(i, q) - g_t(i, q)$ decreases with an increasing $i$. This structure makes it possible to exchange the order of sums and maximizations as in:

$$h_0(1, q) = \sum_{i=1}^{N} \max\{ f_1(i, q),\; g_1(i, q) \} = \max_{\xi_1 \in [0, N]} \left( \sum_{i=1}^{\xi_1} f_1(i, q) + \sum_{i=\xi_1+1}^{N} g_1(i, q) \right).$$

The maximization over $\xi_1$ can be done in polynomial time. However, to show polynomial-time solvability of $\bar{h}_0$, one needs to make similar assumptions in the definition of $f_1$, as illustrated by $\xi_2$ in Figure 5. The order for $\xi_2$ must apply to all children of $f_1(1, q)$ and $f_1(2, q)$ simultaneously; assuming such an order for each $f_1(i, q)$ independently is insufficient. The following lemma formalizes this idea.

Figure 5: Example of a single linear segment of $\bar{h}_0$. The dashed lines represent the values $\xi_1$ and $\xi_2$ respectively. The labels on nodes represent the largest function for the node, e.g. $f_2(1)$ means that $f_2(1) > g_2(1)$.

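Lemma 5.2. Suppose that $T = 3$, $\mu_C$ and $\mu_I$ are distortion risk measures with a uniform reference measure, and $\mu_C$ is in particular a composition of AVaRs. Define $\bar{h}_0(1, q) := \max\{ \bar{f}_0(1, q),\; \bar{g}_0(1, q) \}$ where:

$$\begin{aligned} \bar{f}_0(1, q) = \max_{\xi_1 \in [0, N]} \Bigg( & \max_{\xi_2 \in [0, N^2]} \sum_{i=1}^{\xi_1} \sum_{j=1}^{N} \left( f_2((i,j), q)\, \mathbb{1}_{iN+j \le \xi_2} + \bar{g}_2((i,j), q)\, \mathbb{1}_{iN+j > \xi_2} \right) \\ & + N\varepsilon_2 \cdot \max_{\xi_2' \in [\xi_1, N]} \left( \sum_{i=\xi_1+1}^{\xi_2'} f_2((i,1), q) + \sum_{i=\xi_2'+1}^{N} \bar{g}_2((i,1), q) \right) \Bigg), \end{aligned}$$

$$\bar{g}_0(1, q) = \max\left\{ N\varepsilon_1 \cdot \max_{\xi_2 \in [0, N]} \left( \sum_{j=1}^{\xi_2} f_2((1,j), q) + \sum_{j=\xi_2+1}^{N} \bar{g}_2((1,j), q) \right),\; N^2\varepsilon_1\varepsilon_2 \cdot \max\{ f_2((1,1), q),\; \bar{g}_2((1,1), q) \} \right\},$$

$$\bar{g}_2((i,j), q) = N\varepsilon_3 \cdot q_{(i,j,1)} \quad \forall (i,j) \in \Omega_2.$$

The function $\bar{h}_0(1, q)$ has the following properties:

(i) It is a maximum over at most $N \cdot N^2$ linear functions.

(ii) $\alpha^* = \max_{q \in \mathcal{Q}_I} h_0(1, q) = \max_{q \in \mathcal{Q}_I} \bar{h}_0(1, q)$.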

Proof. The property (i) follows trivially from the definition of $\bar{h}_0$. Since $f_2$ and $\bar{g}_2$ are linear in $q$, the functions $\bar{f}_0$ and $\bar{g}_0$ have at most $N \cdot N^2$ and $N$ linear pieces respectively.

To show the property (ii), we use the fact that the sets $\mathcal{Q}_I$ and $\mathcal{Q}_C$ are invariant with respect to permutations of elements of $q$:

$$q \in \mathcal{Q}_I \Leftrightarrow q(\sigma) \in \mathcal{Q}_I, \qquad q \in \mathcal{Q}_C \Leftrightarrow q(\sigma) \in \mathcal{Q}_C,$$

for any $\sigma \in \Pi(\Omega_T)$, where $q(\sigma)$ represents a permutation of the elements of $q$. In addition, when a permutation $\sigma$ influences only the order of elements in $\mathcal{C}_i$ for some $i \in \Omega_t$ and $t \in [0, T-1]$, then $h_\tau(j, q) = h_\tau(j, q(\sigma))$ for any $\tau \in [0, t]$ and $j \in \Omega_\tau$.






Now, assuming a given $q^* \in \arg\max_{q \in \mathcal{Q}_I} h_0(1, q)$, we show there exists $q' = q^*(\sigma)$ for some $\sigma \in \Pi(\Omega_T)$ such that $h_0(1, q^*) = h_0(1, q') = \bar{h}_0(1, q')$. We analyze the following cases separately, with (C2), (C3), and (C4) being mutually exclusive.

(C1) For each $t \in [0, T-1]$ and $i \in \Omega_t$ such that $g_t(i, q^*) > f_t(i, q^*)$, consider a permutation $\sigma_1 \in \Pi(\Omega_T)$ that moves the largest child to the first position in subtree $\mathcal{C}_i$, such that $h_{t+1}((i,1), q^*(\sigma_1)) \ge h_{t+1}((i,j), q^*(\sigma_1))$ for all $j \in \mathcal{C}_i$. This can be done by separately permuting the nodes for every $\mathcal{C}_i$. Suppose that $q_1 := q^*(\sigma_1)$, then:

$$g_t(i, q_1) = N\varepsilon_{t+1} \cdot h_{t+1}((i,1), q_1) \quad \forall i \in \Omega_t,\; \forall t \in [0, T-1],$$

and $g_2(i, q_1) = \bar{g}_2(i, q_1)$ when $i \in \Omega_2$.

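(C2) When $g_0(1, q_1) \ge f_0(1, q_1)$ and $g_1(1, q_1) < f_1(1, q_1)$, then define $\sigma_2 \in \Pi(\Omega_T)$ such that $f_2((1,j), q_1(\sigma_2)) - g_2((1,j), q_1(\sigma_2))$ decreases with an increasing $j$ for $j \in [1, N]$. This can be done by permuting the nodes in $\mathcal{C}_1$. This permutation does not break the structure achieved by $\sigma_1$ in (C1) because it reorders the children of different nodes. Then:

$$\begin{aligned} h_0(1, q_1) = g_0(1, q_1) &\overset{(C1)}{=} N\varepsilon_1 \cdot f_1(1, q_1) = N\varepsilon_1 \cdot \sum_{j=1}^{N} \max\{ f_2((1,j), q_1),\; g_2((1,j), q_1) \} \\ &= \max_{\xi_2 \in [0, N]} N\varepsilon_1 \cdot \left( \sum_{j=1}^{\xi_2} f_2((1,j), q_1(\sigma_2)) + \sum_{j=\xi_2+1}^{N} g_2((1,j), q_1(\sigma_2)) \right) \\ &\overset{(C1)}{=} \max_{\xi_2 \in [0, N]} N\varepsilon_1 \cdot \left( \sum_{j=1}^{\xi_2} f_2((1,j), q_1(\sigma_2)) + \sum_{j=\xi_2+1}^{N} \bar{g}_2((1,j), q_1(\sigma_2)) \right). \end{aligned}$$

Let $q' = q_1(\sigma_2)$.

(C3) When $g_0(1, q_1) \ge f_0(1, q_1)$ and $g_1(1, q_1) \ge f_1(1, q_1)$, then:

$$h_0(1, q_1) = g_0(1, q_1) = N\varepsilon_1 \cdot g_1(1, q_1) \overset{(C1)}{=} N^2\varepsilon_1\varepsilon_2 \cdot h_2((1,1), q_1) = N^2\varepsilon_1\varepsilon_2 \cdot \max\{ f_2((1,1), q_1),\; \bar{g}_2((1,1), q_1) \}.$$

Let $q' = q_1$.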

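(C4) When $g_0(1, q_1) < f_0(1, q_1)$, then define $\sigma_3 \in \Pi(\Omega_T)$ such that $f_1(i, q_1(\sigma_3)) - g_1(i, q_1(\sigma_3))$ decreases with an increasing $i$. This can be done by permuting the nodes in $\Omega_1$. This permutation does not break the structure achieved by $\sigma_1$ in (C1) since it reorders only nodes where $f_t$ dominates $g_t$. Then, with $q_2 := q_1(\sigma_3)$:

$$h_0(1, q_1) = f_0(1, q_1) = \sum_{i=1}^{N} \max\{ f_1(i, q_2),\; g_1(i, q_2) \} = \max_{\xi_1 \in [0, N]} \left( \sum_{i=1}^{\xi_1} f_1(i, q_1(\sigma_3)) + \sum_{i=\xi_1+1}^{N} g_1(i, q_1(\sigma_3)) \right).$$

Assume a fixed $\xi_1$ and construct $\sigma_4 \in \Pi(\Omega_T)$ such that $f_2((i,j), q_2(\sigma_4)) - g_2((i,j), q_2(\sigma_4))$ decreases with a lexicographically increasing $(i,j)$ for $i \in [1, \xi_1]$. This can be done by permuting the nodes in $\bigcup_{i=1}^{\xi_1} \mathcal{C}_i$. This permutation does not break the structure achieved by $\sigma_1$ and $\sigma_3$ because it reorders children of different nodes. Then:

$$\begin{aligned} \sum_{i=1}^{\xi_1} f_1(i, q_2) &= \sum_{i=1}^{\xi_1} \sum_{j=1}^{N} \max\{ f_2((i,j), q_2),\; g_2((i,j), q_2) \} \\ &\overset{(C1)}{=} \max_{\xi_2 \in [0, N^2]} \sum_{i=1}^{\xi_1} \sum_{j=1}^{N} \left( f_2((i,j), q_2(\sigma_4))\, \mathbb{1}_{iN+j \le \xi_2} + \bar{g}_2((i,j), q_2(\sigma_4))\, \mathbb{1}_{iN+j > \xi_2} \right). \end{aligned}$$

Let $q_3 = q_2(\sigma_4)$, assume a fixed $\xi_1$, and construct $\sigma_5 \in \Pi(\Omega_T)$ such that $f_2((i,1), q_3(\sigma_5)) - g_2((i,1), q_3(\sigma_5))$ decreases with an increasing $i$ for $i \in [\xi_1 + 1, N]$. This can be done by permuting the nodes $[\xi_1 + 1, N]$. This permutation does not break the structure achieved by $\sigma_1$, $\sigma_3$ and $\sigma_4$, since it reorders different nodes. Then:

$$\begin{aligned} \sum_{i=\xi_1+1}^{N} g_1(i, q_3) &\overset{(C1)}{=} N\varepsilon_2 \cdot \sum_{i=\xi_1+1}^{N} \max\{ f_2((i,1), q_3),\; g_2((i,1), q_3) \} \\ &= N\varepsilon_2 \cdot \max_{\xi_2' \in [\xi_1, N]} \left( \sum_{i=\xi_1+1}^{\xi_2'} f_2((i,1), q_3(\sigma_5)) + \sum_{i=\xi_2'+1}^{N} \bar{g}_2((i,1), q_3(\sigma_5)) \right). \end{aligned}$$

Let $q' = q_3(\sigma_5)$.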

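Finally, combining conditions (C1) - (C4) gives us $h_0(1, q^*) = h_0(1, q') = \bar{h}_0(1, q')$ and $q' \in \mathcal{Q}_I$. It can also be seen readily that $\bar{h}_0(1, q) \le h_0(1, q)$, since $h_0$ maximizes over a superset of the linear functions of $\bar{h}_0$; for example:

$$\begin{aligned} h_0(1, q) = \sum_{i=1}^{N} \max\{ f_1(i, q),\; g_1(i, q) \} &= \max_{d \in \{0,1\}^N} \sum_{i=1}^{N} \left( d_i f_1(i, q) + (1 - d_i)\, g_1(i, q) \right) \\ &\ge \max_{\xi_1 \in [0, N]} \left( \sum_{i=1}^{\xi_1} f_1(i, q) + \sum_{i=\xi_1+1}^{N} g_1(i, q) \right) = \bar{h}_0(1, q). \end{aligned}$$

□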
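The symmetry-breaking step behind this inequality is easy to see on plain numbers. The sketch below compares a sum of pairwise maxima with the best single-threshold split; the two coincide once the pairs are ordered by decreasing $f - g$, and the threshold form is only a lower bound otherwise. The lists `f` and `g` stand for the values $f_1(i, q)$ and $g_1(i, q)$ and are made-up numbers.

```python
def sum_of_max(f, g):
    """Sum over i of max(f_i, g_i): a maximum over 2^N selection patterns."""
    return sum(max(a, b) for a, b in zip(f, g))

def best_threshold(f, g):
    """max over xi of (f_1 + ... + f_xi) + (g_{xi+1} + ... + g_N): only N + 1 candidates."""
    return max(sum(f[:k]) + sum(g[k:]) for k in range(len(f) + 1))

f, g = [2, 5, 3], [4, 1, 2]
unsorted_gap = sum_of_max(f, g) - best_threshold(f, g)

# Reorder the pairs so that f_i - g_i is decreasing in i.
order = sorted(range(len(f)), key=lambda i: g[i] - f[i])
fs, gs = [f[i] for i in order], [g[i] for i in order]
sorted_gap = sum_of_max(fs, gs) - best_threshold(fs, gs)
```

After sorting, every index where $f$ dominates precedes every index where $g$ dominates, so a single threshold recovers the full sum of maxima.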

The technique described in Lemma 5.2 can be generalized to an arbitrary number of stages $T$. Such a generalization involves a function $\bar{h}_0$ that is defined as a maximization over $\{\xi_1 \in [0, N], \dots, \xi_{T-1} \in [0, N^{T-1}]\}$ instead of over $\{\xi_1 \in [0, N], \xi_2 \in [0, N^2]\}$. Using the representation in Lemma 5.2, we now show that computing $\alpha^*$ is indeed tractable by solving a linear program for every linear segment of $\bar{h}_0$.

Theorem 5.5. The optimization problem maxg^g^ /io(l, q) can be solved with N^^"^"^^ evaluations of 
linear programs. This runtime is polynomial in N for any fixed number of stages T. 

Proof. Lemma 5.2 shows that it is necessary to optimize only over $\bar{h}_0$, which has a polynomial number of linear pieces. For three stages, this means solving a linear program for each choice of $\xi_1, \xi_2, \xi_2'$ in the definition of $\bar{h}_0$. For example, maximizing $\bar{f}_0$ of Lemma 5.2 entails solving the following optimization problem:

$$\max_{q \in \mathcal{Q}_I} \left( \sum_{i=1}^{\xi_1} \sum_{j=1}^{N} \left( f_2((i,j), q)\, \mathbb{1}_{iN+j \le \xi_2} + \bar{g}_2((i,j), q)\, \mathbb{1}_{iN+j > \xi_2} \right) + N\varepsilon_2 \cdot \left( \sum_{i=\xi_1+1}^{\xi_2'} f_2((i,1), q) + \sum_{i=\xi_2'+1}^{N} \bar{g}_2((i,1), q) \right) \right).$$

Since the set $\mathcal{Q}_I$ is a polymatroid when $\mu_I$ is a distortion risk measure, this optimization problem can be solved in time polynomial in $N^T$. There are, then, $O((N+1) \cdot (N^2+1))$ such linear program evaluations needed, one for each setting $\{\xi_1 \in [0, N], \dots, \xi_{T-1} \in [0, N^{T-1}]\}$, which shows the desired polynomial complexity. □



6 Conclusions 

In this paper, we examined two different paradigms for measuring risk in dynamic settings - a dynami- 
cally consistent (or time-consistent) formulation, whereby the risk assessments are designed so as to avoid 
naive reversals in the decision process, and a dynamically inconsistent one (which is easier to specify 
and calibrate from preference data). We discussed necessary and sufficient conditions under which the 
consistent assessment is always lower than the inconsistent one, and we characterized the price of con- 
sistency, i.e., the factor by which a scaled dynamically consistent formulation would also upper bound a 
dynamically inconsistent one. We furthermore discussed specific cases involving the AVaR risk measure, 
when the price of consistency can be analytically or computationally evaluated. 



7 Appendix 

7.1 Submissives, Downward Monotone Closures and Anti-blocking Polyhedra 

In the current section, we discuss the important notion of the down-monotone closure of a polytope (also known as its anti-blocking polyhedron or its submissive). Our exposition mostly follows Chapter 9 in Schrijver [50], to which we direct the interested reader for a more comprehensive treatment and references to related literature.

A polyhedron $Q$ in $\mathbb{R}^n$ is said to be down(ward)-monotone (or of anti-blocking type) if

$$Q \neq \emptyset, \quad Q \subseteq \mathbb{R}^n_+, \quad \text{and} \quad 0 \le y \le x \text{ and } x \in Q \text{ imply } y \in Q.$$

The following proposition summarizes a useful representation for down-monotone polyhedra.

Proposition 7.1. A polyhedron $Q$ in $\mathbb{R}^n$ is down-monotone if and only if there is a finite set $\mathcal{I}$ of vectors $\{a_i\}_{i \in \mathcal{I}}$ and coefficients $\{b_i\}_{i \in \mathcal{I}}$ such that $a_i \ge 0$, $a_i \neq 0$, $b_i \ge 0$, $\forall i \in \mathcal{I}$, and

$$Q = \{ x \in \mathbb{R}^n_+ : a_i^T x \le b_i,\; \forall i \in \mathcal{I} \}.$$

Proof. The proof follows closely from the definitions. We omit it here, and direct the interested reader 
to Schrijver [50]. □ 

We remark that, whenever Q is full-dimensional, the right-hand sides bi in the representation above 
can be taken to be strictly positive. 

For any polyhedron $Q \subseteq \mathbb{R}^n_+$, we can define its down-monotone closure (also known as its submissive) by

$$\mathrm{sub}(Q) = \{ y \in \mathbb{R}^n_+ : \exists x \in Q,\; x \ge y \}. \tag{39}$$

It can be easily checked that $\mathrm{sub}(Q) = (Q + \mathbb{R}^n_-) \cap \mathbb{R}^n_+$, and that $\mathrm{sub}(Q)$ is full-dimensional if and only if $Q \setminus \{ x \in \mathbb{R}^n_+ : x_j = 0 \} \neq \emptyset$, for all $j \in [1, n]$ (see Balas and Fischetti [8]). A very interesting characterization of the down-monotone closure of a polyhedron is possible in terms of the polar of the polyhedron. However, since these results are not directly needed in our treatment here, we direct the interested reader to Fulkerson [26], Balas and Fischetti [8], Balas et al. [9] and Chapter 9 in Schrijver [50].

Under appropriate conditions, optimizing a linear function over a polyhedron $Q$ is equivalent to optimizing a related linear function over the downward monotone closure of $Q$. The following result, which follows this paradigm, is very useful in our analysis.

Theorem 7.1. For any polyhedron $Q \subseteq \mathbb{R}^n_+$ and for any $w \ge 0$, we have

$$\max_{q \in Q} w^T q = \max_{q \in \mathrm{sub}(Q)} w^T q.$$

Proof. Clearly, since $Q \subseteq \mathrm{sub}(Q)$, the left side is always at most equal to the right side. To argue the reverse, note that for any $x \in \mathrm{sub}(Q)$, there exists $q \in Q$ satisfying $q \ge x \ge 0$, which implies $w^T x \le w^T q$, $\forall w \ge 0$. Since this is true for an arbitrary $x$, the reverse inequality must also hold. □

Down-monotone polyhedra have been used for studying the strength of relaxations in integer programming and combinatorial optimization (see Goemans and Hall [28] and references therein). The following result is relevant for our purposes.

Theorem 7.2. Let $P$ and $Q$ be two downward monotone polytopes in $\mathbb{R}^n_+$, with $Q \subseteq P$. Then

1. $P \subseteq \alpha Q$ if and only if, for any nonnegative vector $w \in \mathbb{R}^n$,

$$\max\{ w^T x : x \in Q \} \ge \frac{1}{\alpha} \max\{ w^T x : x \in P \}.$$

2. Letting $\alpha^*$ denote the minimum value of $\alpha$ such that $P \subseteq \alpha Q$, we have

$$\alpha^* = \sup_{c \ge 0} \frac{\max\{ c^T x : x \in P \}}{\max\{ c^T x : x \in Q \}},$$

where, by convention, $\frac{0}{0} = 1$.

3. If $Q = \{ x \in \mathbb{R}^n_+ : a_i^T x \le b_i,\; \forall i \in \mathcal{I} \}$, where $a_i, b_i \ge 0$, then

$$\alpha^* = \max_{i \in \mathcal{I}} \frac{d_i}{b_i}, \quad \text{where } d_i = \max_{x \in P} a_i^T x.$$

Proof. Part (1) is exactly Lemma 1 in Goemans and Hall [28]. Since the latter reference omits a proof, we include one below, for completeness. "$\Rightarrow$" follows trivially. "$\Leftarrow$" Since $Q \subseteq P$, it must be that $\alpha \ge 1$. Assume (by contradiction) that $\exists x \in P \setminus \alpha Q$. Since $Q$ is down-monotone, by Proposition 7.1, it can be written as $Q = \{ x \in \mathbb{R}^n_+ : a_i^T x \le b_i,\; \forall i \in \mathcal{I} \}$, where $a_i, b_i \ge 0$, $\forall i \in \mathcal{I}$. Since $x \notin \alpha Q$, there exists $j \in \mathcal{I}$ such that $a_j^T x > \alpha b_j$. Since $x \in P$, we obtain the desired contradiction, $\frac{1}{\alpha} \max\{ a_j^T x : x \in P \} \ge \frac{1}{\alpha} a_j^T x > b_j \ge \max\{ a_j^T x : x \in Q \}$.

Part (2) follows as an immediate corollary of Part (1).

Part (3) is exactly Theorem 2 in Goemans and Hall [28], to which we direct the reader for a complete proof. □

The above result shows that $\alpha^*$ can be $+\infty$, which is the case if $Q$ has a strictly smaller dimension than $P$ (in this case, some $b_i$ are 0, while the corresponding $d_i$ are strictly $> 0$ - see Schrijver [50]). However, if $Q$ is full-dimensional, $\alpha^*$ is always finite.



7.2 Submodular Functions and Polymatroids 

In this section of the Appendix, we discuss the basic properties of Choquet capacities in light of their 
connection with rank functions of polymatroids. The exposition is mainly based on volume B of Schrijver 
[51] (Chapter 44) and Chapter 2 of Fujishige [25] (Section 3.3), to which we direct the interested reader 
for more information. 

Consider a ground set $\Omega$ with $|\Omega| = n$, and let $c$ be a set function on $\Omega$, that is, $c : \mathcal{F} \to \mathbb{R}$, where $\mathcal{F} = 2^\Omega$ is the set of all subsets of $\Omega$. The function $c$ is called submodular if

$$c(T) + c(U) \ge c(T \cap U) + c(T \cup U), \quad \forall T, U \in \mathcal{F}.$$

The function $c$ is called nondecreasing if $c(T) \le c(U)$ whenever $T \subseteq U \subseteq \Omega$. For a given set function $c$ on $\Omega$, we define the following two polyhedra

$$\begin{aligned} \mathcal{P}_c &= \{ x \in \mathbb{R}^{|\Omega|} : x \ge 0,\; x(S) \le c(S),\; \forall S \subseteq \Omega \}, \\ \mathcal{EP}_c &= \{ x \in \mathbb{R}^{|\Omega|} : x(S) \le c(S),\; \forall S \subseteq \Omega \}. \end{aligned} \tag{40}$$

Note that $\mathcal{P}_c$ is nonempty if and only if $c \ge 0$, and that $\mathcal{EP}_c$ is nonempty if and only if $c(\emptyset) \ge 0$. These conditions are trivially satisfied in our exposition, since all set functions $c$ of interest are Choquet capacities, i.e., by Definition 2.1, they are nondecreasing and normalized, $c(\emptyset) = 0$, $c(\Omega) = 1$.

If $c$ is a submodular function, then $\mathcal{P}_c$ is called the polymatroid associated with $c$, and $\mathcal{EP}_c$ the extended polymatroid associated with $c$. Note that a nonempty extended polymatroid is always unbounded, while a polymatroid is always a polytope, since $0 \le x_i \le c(\{i\})$, $\forall i \in \Omega$. The next theorem provides a very useful result concerning the set of tight constraints in the representation of $\mathcal{EP}_c$.

Theorem 7.3 (Theorem 44.2 in Schrijver [51]). Let $c$ be a submodular set function on $\Omega$ and let $x \in \mathcal{EP}_c$. 
Then the collection of sets $U \subseteq \Omega$ satisfying $x(U) = c(U)$ is closed under taking unions and intersections. 

Proof. Suppose $x(T) = c(T)$ and $x(U) = c(U)$. Then 

$$c(T) + c(U) \ge c(T \cap U) + c(T \cup U) \ge x(T \cap U) + x(T \cup U) = x(T) + x(U) = c(T) + c(U),$$

hence equality must hold throughout, and $x(T \cap U) = c(T \cap U)$ and $x(T \cup U) = c(T \cup U)$. □ 
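
The closure property can be observed numerically on a small instance. The sketch below is only an illustration under assumed data: the capacity $c(S) = \min\{1, 2|S|/3\}$ on a three-element ground set and the point `x` are hypothetical choices, not objects from the paper. It collects the tight sets of `x` and tests closure under pairwise unions and intersections.

```python
from fractions import Fraction
from itertools import combinations

OMEGA = frozenset({1, 2, 3})


def subsets(ground):
    """Yield every subset of `ground` as a frozenset."""
    items = sorted(ground)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def capacity(S):
    # Hypothetical submodular capacity: c(S) = min(1, 2|S|/3).
    return min(Fraction(1), Fraction(2 * len(S), 3))


def tight_sets(x, c, ground):
    """Return the family of sets S with x(S) == c(S)."""
    return {S for S in subsets(ground) if sum(x[i] for i in S) == c(S)}


def is_lattice_family(family):
    """True if the family is closed under pairwise unions and intersections."""
    return all((T | U) in family and (T & U) in family
               for T in family for U in family)


# A point of the extended polymatroid: every constraint x(S) <= c(S) holds.
x = {1: Fraction(2, 3), 2: Fraction(1, 3), 3: Fraction(0)}
assert all(sum(x[i] for i in S) <= capacity(S) for S in subsets(OMEGA))

family = tight_sets(x, capacity, OMEGA)
print(sorted(sorted(S) for S in family))
print(is_lattice_family(family))
```

For this particular point the tight sets form the chain $\emptyset \subset \{1\} \subset \{1,2\} \subset \Omega$, which is trivially closed under both operations.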

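A vector $x \in \mathcal{EP}_c$ (or in $\mathcal{P}_c$) is called a base vector of $\mathcal{EP}_c$ (or of $\mathcal{P}_c$) if $x(\Omega) = c(\Omega)$. The set of all 
base vectors is called the base polytope of $c$ and is denoted by $\mathcal{B}_c$, 

$$\mathcal{B}_c = \{ x \in \mathbb{R}^{|\Omega|} : x(S) \le c(S), \ \forall \, S \subseteq \Omega, \ x(\Omega) = c(\Omega) \}.$$

The following theorem summarizes several simple properties of $\mathcal{B}_c$, and its relation to $\mathcal{EP}_c$ and $\mathcal{P}_c$. 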

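Theorem 7.4. For any submodular function $c$ satisfying $c(\emptyset) = 0$, 

(i) $\mathcal{B}_c$ is a face of $\mathcal{EP}_c$, and is always a polytope. 

(ii) $\mathcal{EP}_c = \mathcal{B}_c + \mathbb{R}^n_-$, so that $\mathcal{EP}_c$ and $\mathcal{B}_c$ have the same extreme points. 

(iii) $\mathcal{P}_c = \mathrm{sub}(\mathcal{B}_c)$. 

(iv) For any $\lambda \ge 0$, $\mathcal{B}_{\lambda c} = \lambda \cdot \mathcal{B}_c$, $\mathcal{EP}_{\lambda c} = \lambda \cdot \mathcal{EP}_c$, and $\mathcal{P}_{\lambda c} = \lambda \cdot \mathcal{P}_c$. 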

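Proof. (i) The fact that $\mathcal{B}_c$ is a face of $\mathcal{EP}_c$ follows directly from the definitions. To see that $\mathcal{B}_c$ is a 
polytope, note that, for any $i \in \Omega$, $x_i \le c(\{i\})$, and $x_i = x(\Omega) - x(\Omega \setminus \{i\}) \ge c(\Omega) - c(\Omega \setminus \{i\})$. 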

(ii) "$\supseteq$" Follows trivially. "$\subseteq$" Consider any $y \in \mathcal{EP}_c$. Without loss of generality*, assume $y$ does 
not lie in the strict interior of $\mathcal{EP}_c$, and let $\mathcal{I}_y := \{ S \in \mathcal{F} : y(S) = c(S) \}$ denote the collection of 
sets corresponding to tight constraints at $y$. If $\Omega \in \mathcal{I}_y$, then $y \in \mathcal{B}_c$, and the proof would be complete. 
Therefore, let us assume $\Omega \notin \mathcal{I}_y$. 

We claim that there exists $s \in \Omega$ such that $s \notin S$, $\forall \, S \in \mathcal{I}_y$. To see this, note that, if every $s \in \Omega$ 
were contained in some $S \in \mathcal{I}_y$, then the union of these sets would equal $\Omega$, and hence $\Omega \in \mathcal{I}_y$, since the set of 
tight constraints is closed under union and intersection, by Theorem 7.3. We can then consider the vector 
$y^{\lambda} = y + \lambda \mathbf{1}_s$ for $\lambda \ge 0$. It is easy to check that, for small enough $\lambda$, $y^{\lambda} \in \mathcal{EP}_c$. By making $\lambda$ sufficiently large, 
at least one constraint corresponding to a set $S$ containing $s$ becomes tight, hence enlarging the set $\mathcal{I}_y$. Repeating 
the argument for the point $y^{\lambda}$ recursively, we eventually recover a vector $\bar{y}$ that belongs to $\mathcal{B}_c$. Since 
$\bar{y} = y + \xi$ for some $\xi \ge 0$, we have that $y \in \mathcal{B}_c + \mathbb{R}^n_-$, which completes the proof of the first part of (ii). 
Since $\mathbb{R}^n_-$ is a cone, and $\mathcal{B}_c$ is a polytope, the representation exactly corresponds to the Motzkin decomposition 
of an arbitrary polyhedron, so that $\mathrm{ext}(\mathcal{EP}_c) = \mathrm{ext}(\mathcal{B}_c)$. 

*Such a $y$ can always be obtained by adding a certain $\xi \ge 0$, and if the resulting $y + \xi \in \mathcal{B}_c + \mathbb{R}^n_-$, then also $y \in \mathcal{B}_c + \mathbb{R}^n_-$. 

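(iii) Follows immediately from (ii), since $\mathcal{P}_c = \mathcal{EP}_c \cap \mathbb{R}^n_+ = (\mathcal{B}_c + \mathbb{R}^n_-) \cap \mathbb{R}^n_+ = \mathrm{sub}(\mathcal{B}_c)$. 

(iv) Since $\lambda c$ is also submodular, the results immediately follow from the definitions. □ 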

A central result in the theory of submodularity, due to Edmonds, is that a linear function $w^T x$ can 
be optimized over an (extended) polymatroid by an extension of the greedy algorithm. The following 
theorem summarizes the finding. 

Theorem 7.5 (Theorem 44.3, Corollaries 44.3a,b in Schrijver [51]). Let $c$ be a submodular set function 
on $\Omega$ with $c(\emptyset) = 0$ and let $w \in \mathbb{R}^{|\Omega|}_+$. Then an optimal solution of $\max_{x \in \mathcal{EP}_c} w^T x$ is given by 

$$x(s_i) = c(\{s_1, \dots, s_i\}) - c(\{s_1, \dots, s_{i-1}\}), \quad i \in [1,n],$$

where $(s_1, \dots, s_n)$ is a permutation of the elements of $\Omega$ such that $w(s_1) \ge w(s_2) \ge \cdots \ge w(s_n)$. If $c$ is 
also nondecreasing, then the above $x$ is also an optimal solution to the problem $\max_{x \in \mathcal{P}_c} w^T x$. 

Proof. The proof follows by duality arguments. We omit it here, and direct the interested reader to 
Schrijver [51]. □ 
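
The greedy rule in Theorem 7.5 can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the capacity, the weight vector `w`, and the function names are hypothetical. The sketch also evaluates the standard dual solution $y(U_i) = w(s_i) - w(s_{i+1})$ on the prefix sets $U_i = \{s_1, \dots, s_i\}$ (with $w(s_{n+1}) := 0$), whose objective $\sum_i y(U_i) c(U_i)$ matches the primal value and thereby certifies optimality, in the spirit of the duality argument mentioned in the proof.

```python
from fractions import Fraction


def capacity(S):
    # Hypothetical submodular, nondecreasing capacity with c(empty set) = 0.
    return min(Fraction(1), Fraction(2 * len(S), 3))


def greedy_maximizer(c, w):
    """Greedy solution of max w^T x over the extended polymatroid of c.

    `w` maps each element of the ground set to a nonnegative weight.
    Elements are visited in order of nonincreasing weight, and each one
    receives the marginal value c(prefix + {s}) - c(prefix).
    """
    order = sorted(w, key=lambda s: w[s], reverse=True)
    x, prefix = {}, frozenset()
    for s in order:
        extended = prefix | {s}
        x[s] = c(extended) - c(prefix)
        prefix = extended
    return x, order


def dual_value(c, w, order):
    """Objective of the dual solution y(U_i) = w(s_i) - w(s_{i+1})."""
    total, prefix = Fraction(0), frozenset()
    for i, s in enumerate(order):
        prefix = prefix | {s}
        next_weight = w[order[i + 1]] if i + 1 < len(order) else 0
        total += (w[s] - next_weight) * c(prefix)
    return total


w = {1: 3, 2: 1, 3: 2}  # hypothetical nonnegative weights
x, order = greedy_maximizer(capacity, w)
primal = sum(w[s] * x[s] for s in w)
print(order, x, primal, dual_value(capacity, w, order))
```

With these assumed inputs the elements are processed in the order 1, 3, 2, and the primal and dual objectives both equal 8/3.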

In view of this result, the following characterization for the extreme points of $\mathcal{B}_c$, $\mathcal{EP}_c$ and $\mathcal{P}_c$ is 
immediate. 

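Theorem 7.6. For a submodular set function $c$ satisfying $c(\emptyset) = 0$, the extreme points of $\mathcal{B}_c$ and $\mathcal{EP}_c$ 
are given by 

$$x_{\sigma(i)} = c(\{\sigma(1), \dots, \sigma(i)\}) - c(\{\sigma(1), \dots, \sigma(i-1)\}), \quad i \in [1,n],$$

where $\sigma \in \Pi(\Omega)$ is any permutation of the elements of $\Omega$. When $c$ is also nondecreasing, the extreme 
points of $\mathcal{P}_c$ are given by 

$$x_{\sigma(i)} = \begin{cases} c(\{\sigma(1), \dots, \sigma(i)\}) - c(\{\sigma(1), \dots, \sigma(i-1)\}), & \text{if } i \le k, \\ 0, & \text{if } i > k, \end{cases}$$

where $\sigma \in \Pi(\Omega)$ is any permutation of the elements of $\Omega$, and $k$ ranges over $[0,n]$. 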

Proof. For a complete proof, we direct the reader to Theorem 3.22 in Fujishige [25] and Section 44.6c in 
Schrijver [51]. □ 
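
For small ground sets, the characterization in Theorem 7.6 can be enumerated directly. The sketch below is a hedged illustration on assumed data (the capacity $c(S) = \min\{1, 2|S|/3\}$ and the function names are hypothetical): it generates the marginal-value vectors for every permutation $\sigma$ and every truncation level $k$, and collapses duplicates, since distinct pairs $(\sigma, k)$ may produce the same point.

```python
from fractions import Fraction
from itertools import permutations

OMEGA = (1, 2, 3)


def capacity(S):
    # Hypothetical submodular, nondecreasing capacity with c(empty set) = 0.
    return min(Fraction(1), Fraction(2 * len(S), 3))


def marginal_vector(c, sigma, k):
    """Coordinate sigma(i) gets c({sigma(1..i)}) - c({sigma(1..i-1)})
    for i <= k, and 0 for i > k. Returned in sorted element order."""
    x = {s: Fraction(0) for s in sigma}
    prefix = frozenset()
    for s in sigma[:k]:
        extended = prefix | {s}
        x[s] = c(extended) - c(prefix)
        prefix = extended
    return tuple(x[s] for s in sorted(sigma))


def base_polytope_vertices(c, ground):
    """Points given by the formula with k = n, one per permutation."""
    n = len(ground)
    return {marginal_vector(c, sigma, n) for sigma in permutations(ground)}


def polymatroid_vertices(c, ground):
    """Points given by the truncated formula, for k = 0, ..., n."""
    n = len(ground)
    return {marginal_vector(c, sigma, k)
            for sigma in permutations(ground) for k in range(n + 1)}


base = base_polytope_vertices(capacity, OMEGA)
poly = polymatroid_vertices(capacity, OMEGA)
print(len(base), len(poly))
```

On this instance the base polytope has the six permutations of $(2/3, 1/3, 0)$ as vertices, and the polymatroid additionally has the origin and the three points with a single coordinate equal to $2/3$.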

The following corollary also immediately follows from the above result. 

Corollary 7.1. For any submodular $c$ such that $c(\emptyset) = 0$, $\mathcal{B}_c \subseteq \mathbb{R}^n_+$ if and only if $c$ is nondecreasing. 

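Proof. "$\Leftarrow$" Immediate, since $\mathcal{B}_c$ is the convex hull of its extreme points, which (by Theorem 7.6) are 
nonnegative when $c$ is nondecreasing. "$\Rightarrow$" Consider any two sets $T \subseteq U \subseteq \Omega$, and take a chain of sets 
$S_1 \subset S_2 \subset \cdots \subset S_{|U \setminus T|+1}$, each obtained from the previous one by adding a single element, 
such that $S_1 = T$ and $S_{|U \setminus T|+1} = U$. By Theorem 7.6, there exists an extreme point $x$ of $\mathcal{B}_c$ having elements 
$c(S_{i+1}) - c(S_i)$, $i \in [1, |U \setminus T|]$, among some of its coordinates. Since $x \ge 0$, summing these coordinates 
yields $c(U) - c(T) \ge 0$. □ 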
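
The role of monotonicity in Corollary 7.1 can be seen on two small examples. Both set functions below are hypothetical and chosen only for illustration: each is a concave function of $|S|$ (hence submodular) with $c(\emptyset) = 0$, but only the second is nondecreasing. The sketch enumerates the base polytope vertices via the permutation formula of Theorem 7.6 and checks their signs.

```python
from itertools import permutations

OMEGA = (1, 2, 3)


def nonmonotone(S):
    # Hypothetical: c(S) = |S| * (3 - |S|), so c(OMEGA) = 0 < c({1}) = 2.
    return len(S) * (3 - len(S))


def monotone(S):
    # Hypothetical: c(S) = min(|S|, 2), which is nondecreasing.
    return min(len(S), 2)


def base_vertices(c, ground):
    """Marginal-value vectors, one per permutation of the ground set."""
    vertices = set()
    for sigma in permutations(ground):
        x, prefix = {}, frozenset()
        for s in sigma:
            extended = prefix | {s}
            x[s] = c(extended) - c(prefix)
            prefix = extended
        vertices.add(tuple(x[s] for s in sorted(ground)))
    return vertices


def all_nonnegative(vertices):
    return all(coord >= 0 for v in vertices for coord in v)


print(all_nonnegative(base_vertices(nonmonotone, OMEGA)))
print(all_nonnegative(base_vertices(monotone, OMEGA)))
```

The non-monotone function yields vertices that are permutations of $(2, 0, -2)$, so its base polytope leaves the nonnegative orthant, while the monotone one yields permutations of $(1, 1, 0)$.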